Reinforcement learning ohne Backpropagation in Neural Regulatory Networks : eine erste Abschätzung : a preliminary assessment

Lemmel, Julian

doi:10.34726/hss.2020.81325

DC Field

Value

Language

dc.contributor.advisor

Grosu, Radu

dc.contributor.author

Lemmel, Julian

dc.date.accessioned

2020-10-01T08:23:38Z

dc.date.issued

2020

dc.date.submitted

2020

dc.identifier.citation

Lemmel, J. (2020). Reinforcement learning ohne Backpropagation in Neural Regulatory Networks : eine erste Abschätzung : a preliminary assessment [Diploma Thesis, Technische Universität Wien]. reposiTUm. https://doi.org/10.34726/hss.2020.81325

dc.identifier.uri

https://doi.org/10.34726/hss.2020.81325

dc.identifier.uri

http://hdl.handle.net/20.500.12708/15747

dc.description.abstract

Reinforcement Learning (RL) aims at creating controllers for discrete and continuous problems and was initially inspired by neuroscience. However, the most successful methods rely on backpropagation to calculate the gradients of the loss function. The backpropagation algorithm is considered biologically implausible, which suggests that it will not suffice when striving for human-like learning abilities. Neuroscience has produced different models of synaptic plasticity by observing isolated neurons. Such models could serve as alternatives to the ubiquitous backpropagation algorithm for calculating changes to network parameters. Neural Regulatory Networks (NRNs) are special recurrent neural networks (RNNs) whose inner states are calculated according to dynamics derived from biological observations. In this thesis, a novel framework based on state-of-the-art RL techniques and using NRNs is introduced and evaluated on a cartpole balancing task. Two methods of incorporating learning rules based on models of synaptic plasticity are investigated. The custom gradients method replaces the real gradient calculated by backpropagation with a biologically plausible synaptic plasticity rule. The plasticity dynamics method leaves the gradients unchanged but introduces additional plasticity dynamics that act throughout the entire unrolling of network states. Both methods were tested with three learning rules: Hebb’s rule, Oja’s rule and the BCM rule. The results suggest that training can be accelerated when using the BCM rule.
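For orientation, the three learning rules named in the abstract can be sketched in their common textbook forms for a single linear neuron with output y = w · x. This is a minimal illustration only, not the formulation used in the thesis: the function names, the learning rate `lr`, and the threshold rate `tau` are assumptions chosen for this example.

```python
import numpy as np


def hebb_update(w, x, y, lr=0.01):
    """Hebb's rule: weights grow with correlated pre/post activity (dw = lr * y * x)."""
    return w + lr * y * x


def oja_update(w, x, y, lr=0.01):
    """Oja's rule: Hebbian term plus a decay that keeps the weight norm bounded."""
    return w + lr * y * (x - y * w)


def bcm_update(w, theta, x, y, lr=0.01, tau=0.1):
    """BCM rule: potentiate when y > theta, depress when y < theta.

    theta is a sliding threshold that tracks a running average of y**2
    (one common choice; assumed here for illustration).
    """
    w_new = w + lr * y * (y - theta) * x
    theta_new = theta + tau * (y ** 2 - theta)
    return w_new, theta_new


# Hypothetical toy input: one presynaptic pattern and the resulting output.
w = np.array([1.0, 0.0])
x = np.array([1.0, 0.0])
y = float(w @ x)

w_hebb = hebb_update(w, x, y)
w_oja = oja_update(w, x, y)
w_bcm, theta_bcm = bcm_update(w, 2.0, x, y)
```

In this toy case the plain Hebbian update increases the active weight, Oja's rule leaves the already unit-norm weight vector unchanged, and the BCM update decreases the weight because the output lies below the threshold.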

dc.language

English

dc.language.iso

dc.rights.uri

http://rightsstatements.org/vocab/InC/1.0/

dc.subject

Backpropagation

dc.title

Reinforcement learning ohne Backpropagation in Neural Regulatory Networks : eine erste Abschätzung : a preliminary assessment

dc.title.alternative

Reinforcement learning without backpropagation in neural regulatory networks

dc.type

Thesis

dc.type

Hochschulschrift

dc.rights.license

In Copyright

dc.rights.license

Urheberrechtsschutz

dc.identifier.doi

10.34726/hss.2020.81325

dc.contributor.affiliation

TU Wien, Österreich

dc.rights.holder

Julian Lemmel

dc.publisher.place

Wien

tuw.version

vor

tuw.thesisinformation

Technische Universität Wien

tuw.publication.orgunit

E191 - Institut für Computer Engineering

dc.type.qualificationlevel

Diploma

dc.identifier.libraryid

AC15760934

dc.description.numberOfPages

dc.thesistype

Diplomarbeit

dc.thesistype

Diploma Thesis

tuw.author.orcid

0000-0002-3517-2860

dc.rights.identifier

In Copyright

dc.rights.identifier

Urheberrechtsschutz

tuw.advisor.staffStatus

staff

tuw.advisor.orcid

0000-0001-5715-2142

item.languageiso639-1

item.openairetype

master thesis

item.grantfulltext

open

item.fulltext

with Fulltext

item.cerifentitytype

Publications

item.mimetype

application/pdf

item.openairecristype

http://purl.org/coar/resource_type/c_bdcc

item.openaccessfulltext

Open Access

crisitem.author.dept

E191-01 - Forschungsbereich Cyber-Physical Systems

crisitem.author.orcid

0000-0002-3517-2860

crisitem.author.parentorg

E191 - Institut für Computer Engineering

Appears in Collections:

Thesis

Fulltext (Version of Record (published version))

Adobe PDF

(1.21 MB)

In Copyright

Page view(s)

414

checked on Nov 23, 2023

Download(s)

159

checked on Nov 23, 2023
