<div class="csl-bib-body">
<div class="csl-entry">Neufeld, E. A., Bartocci, E., & Ciabattoni, A. (2022). On Normative Reinforcement Learning via Safe Reinforcement Learning. In <i>PRIMA 2022: Principles and Practice of Multi-Agent Systems - Proceedings</i> (pp. 72–89). https://doi.org/10.1007/978-3-031-21203-1_5</div>
</div>
-
dc.identifier.uri
http://hdl.handle.net/20.500.12708/136993
-
dc.description.abstract
Reinforcement learning (RL) has proven a successful technique for teaching autonomous agents goal-directed behaviour. As RL agents further integrate with our society, they must learn to comply with ethical, social, or legal norms. Defeasible deontic logics are natural formal frameworks to specify and reason about such norms in a transparent way. However, their effective and efficient integration in RL agents remains an open problem. On the other hand, linear temporal logic (LTL) has been successfully employed to synthesize RL policies satisfying, e.g., safety requirements. In this paper, we investigate the extent to which the established machinery for safe reinforcement learning can be leveraged for directing normative behaviour for RL agents. We analyze some of the difficulties that arise from attempting to represent norms with LTL, provide an algorithm for synthesizing LTL specifications from certain normative systems, and analyze its power and limits with a case study.
en
dc.description.sponsorship
WWTF Wiener Wissenschafts-, Forschungs- und Technologiefonds
-
dc.language.iso
en
-
dc.relation.ispartofseries
Lecture Notes in Computer Science
-
dc.subject
Safe Reinforcement Learning
en
dc.subject
temporal logic
en
dc.subject
deontic logic
en
dc.title
On Normative Reinforcement Learning via Safe Reinforcement Learning
en
dc.type
Inproceedings
en
dc.type
Konferenzbeitrag
de
dc.relation.isbn
978-3-031-21203-1
-
dc.relation.doi
10.1007/978-3-031-21203-1_5
-
dc.description.startpage
72
-
dc.description.endpage
89
-
dc.relation.grantno
MA16-028
-
dcterms.dateSubmitted
2022
-
dc.type.category
Full-Paper Contribution
-
tuw.booktitle
PRIMA 2022: Principles and Practice of Multi-Agent Systems - Proceedings
-
tuw.container.volume
13753
-
tuw.peerreviewed
true
-
tuw.book.ispartofseries
Lecture Notes in Computer Science (LNCS)
-
tuw.project.title
Werkzeuge für logisches Schließen in der Deontischen Logik und Anwendungen auf heilige indische Schriften