<div class="csl-bib-body">
<div class="csl-entry">Böck, M., & Heitzinger, C. (2022). Speedy Categorical Distributional Reinforcement Learning and Complexity Analysis. <i>SIAM Journal on the Mathematics of Data Science</i>, <i>4</i>(2), 675–693. https://doi.org/10.1137/20M1364436</div>
</div>
-
dc.identifier.issn
2577-0187
-
dc.identifier.uri
http://hdl.handle.net/20.500.12708/139382
-
dc.description.abstract
In distributional reinforcement learning, the entire distribution of the return is modeled instead of just the expected return. Approximating this distribution with categorical distributions is well-known in Q-learning, and convergence results have been established in the tabular case. In this work, speedy Q-learning is extended to categorical distributions, a finite-time analysis is performed, and probably approximately correct bounds in terms of the Cramér distance are established. It is shown that, in the distributional case as well, the new update rule yields faster policy evaluation than the standard Q-learning update and that the sample complexity is essentially the same as that of the value-based algorithmic counterpart. Without the need for more state-action-reward samples, one gains significantly more information about the return with categorical distributions. Even though the results do not easily extend to the case of policy control, a slight modification to the update rule yields promising numerical results.
en
dc.language.iso
en
-
dc.publisher
SIAM PUBLICATIONS
-
dc.relation.ispartof
SIAM Journal on the Mathematics of Data Science
-
dc.subject
reinforcement learning
en
dc.subject
distributional reinforcement learning
en
dc.subject
Q-learning
en
dc.subject
PAC bounds
en
dc.subject
complexity analysis
en
dc.title
Speedy Categorical Distributional Reinforcement Learning and Complexity Analysis
en
dc.type
Article
en
dc.type
Artikel
de
dc.description.startpage
675
-
dc.description.endpage
693
-
dcterms.dateSubmitted
2020-09-04
-
dc.type.category
Original Research Article
-
tuw.container.volume
4
-
tuw.container.issue
2
-
tuw.journal.peerreviewed
true
-
tuw.peerreviewed
true
-
tuw.researchTopic.id
C6
-
tuw.researchTopic.name
Modeling and Simulation
-
tuw.researchTopic.value
100
-
dcterms.isPartOf.title
SIAM Journal on the Mathematics of Data Science
-
tuw.publication.orgunit
E101-03-2 - Forschungsgruppe Maschinelles Lernen und Unsicherheitsquantifizierung
-
tuw.publisher.doi
10.1137/20M1364436
-
dc.identifier.eissn
2577-0187
-
dc.description.numberOfPages
19
-
wb.sci
true
-
wb.sciencebranch
Mathematik
-
wb.sciencebranch.oefos
1010
-
wb.sciencebranch.value
100
-
item.openairetype
Article
-
item.openairetype
Artikel
-
item.grantfulltext
none
-
item.cerifentitytype
Publications
-
item.languageiso639-1
en
-
item.openairecristype
http://purl.org/coar/resource_type/c_18cf
-
item.fulltext
no Fulltext
-
crisitem.author.dept
E194-01 - Forschungsbereich Information und Software Engineering
-
crisitem.author.dept
E101-03-2 - Forschungsgruppe Maschinelles Lernen und Unsicherheitsquantifizierung
-
crisitem.author.parentorg
E194 - Institut für Information Systems Engineering
-
crisitem.author.parentorg
E101-03 - Forschungsbereich Scientific Computing and Modelling