We introduce a biologically plausible reinforcement learning (RL) framework for solving tasks in partially observable Markov decision processes (POMDPs). The proposed algorithm combines three integral parts: (1) a Meta-RL architecture resembling the mammalian basal ganglia; (2) a biologically plausible reinforcement learning algorithm that exploits temporal-difference learning and eligibility traces to train the policy and the value function; and (3) an online automatic differentiation algorithm for computing the gradients with respect to the parameters of a shared recurrent network backbone. Our experimental results show that the method is capable of solving a diverse set of partially observable reinforcement learning tasks. The algorithm, which we call real-time recurrent reinforcement learning (RTRRL), serves as a model of learning in biological neural networks, mimicking reward pathways in the basal ganglia.
en
dc.language.iso
en
-
dc.subject
Reinforcement Learning
en
dc.subject
Recurrent Neural Networks
en
dc.subject
Biologically-plausible Deep Learning
en
dc.subject
Temporal-Difference Learning
en
dc.subject
Eligibility Traces
en
dc.subject
Real-Time Recurrent Learning
en
dc.subject
Online Learning
en
dc.subject
Continual Learning
en
dc.title
Real-Time Recurrent Reinforcement Learning
en
dc.type
Inproceedings
en
dc.type
Konferenzbeitrag
de
dc.relation.isbn
978-1-57735-897-8
-
dc.description.startpage
18189
-
dc.description.endpage
18197
-
dc.type.category
Full-Paper Contribution
-
tuw.booktitle
Proceedings of the 39th Annual AAAI Conference on Artificial Intelligence
-
tuw.container.volume
39/17
-
tuw.peerreviewed
true
-
tuw.relation.publisher
AAAI Press
-
tuw.relation.publisherplace
Washington DC
-
tuw.researchinfrastructure
Vienna Scientific Cluster
-
tuw.researchTopic.id
C4
-
tuw.researchTopic.id
I3
-
tuw.researchTopic.name
Mathematical and Algorithmic Foundations
-
tuw.researchTopic.name
Automation and Robotics
-
tuw.researchTopic.value
20
-
tuw.researchTopic.value
80
-
tuw.publication.orgunit
E191-01 - Forschungsbereich Cyber-Physical Systems
-
tuw.publication.orgunit
E056-15 - Fachbereich Resilient Embedded Systems
-
tuw.publication.orgunit
E056-17 - Fachbereich Trustworthy Autonomous Cyber-Physical Systems
-
tuw.publisher.doi
10.1609/aaai.v39i17.34001
-
dc.description.numberOfPages
9
-
tuw.author.orcid
0000-0002-3517-2860
-
tuw.author.orcid
0000-0001-5715-2142
-
tuw.event.name
39th Annual AAAI Conference on Artificial Intelligence
en
dc.description.sponsorshipexternal
FFG
-
dc.relation.grantnoexternal
FO999899799
-
tuw.event.startdate
25-02-2025
-
tuw.event.enddate
04-03-2025
-
tuw.event.online
Hybrid
-
tuw.event.type
Event for scientific audience
-
tuw.event.place
Philadelphia, Pennsylvania
-
tuw.event.country
US
-
tuw.event.institution
Association for the Advancement of Artificial Intelligence
-
tuw.event.presenter
Lemmel, Julian
-
tuw.event.track
Multi Track
-
wb.sciencebranch
Computer Sciences
-
wb.sciencebranch
Neurosciences
-
wb.sciencebranch
Mathematics
-
wb.sciencebranch.oefos
1020
-
wb.sciencebranch.oefos
3014
-
wb.sciencebranch.oefos
1010
-
wb.sciencebranch.value
70
-
wb.sciencebranch.value
20
-
wb.sciencebranch.value
10
-
item.languageiso639-1
en
-
item.fulltext
no Fulltext
-
item.cerifentitytype
Publications
-
item.openairecristype
http://purl.org/coar/resource_type/c_5794
-
item.openairetype
conference paper
-
item.grantfulltext
none
-
crisitem.author.dept
E191-01 - Forschungsbereich Cyber-Physical Systems
-
crisitem.author.dept
E191-01 - Forschungsbereich Cyber-Physical Systems