<div class="csl-bib-body">
<div class="csl-entry">Pranger, S., Chockler, H., Tappler, M., & Könighofer, B. (2024). Test Where Decisions Matter: Importance-driven Testing for Deep Reinforcement Learning. In A. Globerson, L. Mackey, & D. Belgrave (Eds.), <i>Advances in Neural Information Processing Systems 37 (NeurIPS 2024)</i>. http://hdl.handle.net/20.500.12708/213273</div>
</div>
-
dc.identifier.uri
http://hdl.handle.net/20.500.12708/213273
-
dc.description.abstract
In many Deep Reinforcement Learning (RL) problems, the decisions of a trained policy vary in their significance for the policy's expected safety and performance. Since RL policies are very complex, testing efforts should concentrate on states in which the agent's decisions have the highest impact on the expected outcome. In this paper, we propose a novel model-based method to rigorously compute a ranking of state importance across the entire state space, and we focus our testing efforts on the highest-ranked states. We concentrate on testing for safety, although the proposed methods can be easily adapted to test for performance. In each iteration, our testing framework computes optimistic and pessimistic safety estimates. These estimates provide lower and upper bounds on the expected outcomes of the policy execution across all modeled states in the state space. Upon convergence, our approach divides the state space into safe and unsafe regions, providing clear insights into the policy's weaknesses. Two important properties characterize our approach. (1) Optimal Test-Case Selection: At any time in the testing process, our approach evaluates the policy in the states that are most critical for safety. (2) Guaranteed Safety: Our approach can provide formal verification guarantees over the entire state space by sampling only a fraction of the policy. Any safety properties assured by the pessimistic estimate are formally proven to hold for the policy. We provide a detailed evaluation of our framework on several examples, showing that our method discovers unsafe policy behavior with low testing effort.
en
dc.description.sponsorship
WWTF Wiener Wissenschafts-, Forschungs- und Technologiefonds
-
dc.language.iso
en
-
dc.subject
Policy Verification
en
dc.subject
Probabilistic Model Checking
en
dc.subject
Deep Reinforcement Learning
en
dc.title
Test Where Decisions Matter: Importance-driven Testing for Deep Reinforcement Learning
en
dc.type
Inproceedings
en
dc.type
Konferenzbeitrag
de
dc.contributor.affiliation
Graz University of Technology, Austria
-
dc.contributor.affiliation
King's College London, United Kingdom of Great Britain and Northern Ireland (the)
-
dc.contributor.affiliation
Graz University of Technology, Austria
-
dc.relation.grantno
ICT22-023
-
dc.type.category
Full-Paper Contribution
-
tuw.booktitle
Advances in Neural Information Processing Systems 37 (NeurIPS 2024)
-
tuw.peerreviewed
true
-
tuw.book.ispartofseries
NeurIPS Proceedings
-
tuw.project.title
Training and Guiding AI Agents with Ethical Rules
-
tuw.researchTopic.id
I1
-
tuw.researchTopic.id
C4
-
tuw.researchTopic.id
C6
-
tuw.researchTopic.name
Logic and Computation
-
tuw.researchTopic.name
Mathematical and Algorithmic Foundations
-
tuw.researchTopic.name
Modeling and Simulation
-
tuw.researchTopic.value
34
-
tuw.researchTopic.value
33
-
tuw.researchTopic.value
33
-
tuw.publication.orgunit
E191-01 - Forschungsbereich Cyber-Physical Systems
-
dc.description.numberOfPages
24
-
tuw.author.orcid
0009-0000-6011-9925
-
tuw.author.orcid
0000-0003-1219-0713
-
tuw.author.orcid
0000-0001-5183-5452
-
tuw.event.name
Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, NeurIPS 2024
-
tuw.event.startdate
10-12-2024
-
tuw.event.enddate
15-12-2024
-
tuw.event.online
On Site
-
tuw.event.type
Event for scientific audience
-
tuw.event.place
Vancouver
-
tuw.event.country
CA
-
tuw.event.presenter
Pranger, Stefan
-
tuw.event.track
Multi Track
-
wb.sciencebranch
Computer Sciences
-
wb.sciencebranch
Electrical Engineering, Electronics, Information Engineering
-
wb.sciencebranch
Mathematics
-
wb.sciencebranch.oefos
1020
-
wb.sciencebranch.oefos
2020
-
wb.sciencebranch.oefos
1010
-
wb.sciencebranch.value
65
-
wb.sciencebranch.value
10
-
wb.sciencebranch.value
25
-
item.grantfulltext
none
-
item.fulltext
no Fulltext
-
item.languageiso639-1
en
-
item.openairecristype
http://purl.org/coar/resource_type/c_5794
-
item.openairetype
conference paper
-
item.cerifentitytype
Publications
-
crisitem.project.funder
WWTF Wiener Wissenschafts-, Forschungs- und Technologiefonds
-
crisitem.project.grantno
ICT22-023
-
crisitem.author.dept
Graz University of Technology, Austria
-
crisitem.author.dept
King's College London, United Kingdom of Great Britain and Northern Ireland (the)
-
crisitem.author.dept
E191-01 - Forschungsbereich Cyber-Physical Systems