<div class="csl-bib-body">
<div class="csl-entry">Eisenhut, J., Torralba, Á., Christakis, M., &amp; Hoffmann, J. (2023). Automatic Metamorphic Test Oracles for Action-Policy Testing. In S. Koenig, R. Stern, &amp; M. Vallati (Eds.), <i>Proceedings of the Thirty-Third International Conference on Automated Planning and Scheduling</i> (pp. 109–117). Association for the Advancement of Artificial Intelligence. https://doi.org/10.1609/icaps.v33i1.27185</div>
</div>
-
dc.identifier.uri
http://hdl.handle.net/20.500.12708/188181
-
dc.description.abstract
Testing is a promising way to gain trust in learned action policies π. Prior work on action-policy testing in AI planning formalized bugs as states t where π is sub-optimal with respect to a given testing objective. Deciding whether or not t is a bug is as hard as (optimal) planning itself. How can we design test oracles able to recognize some states t to be bugs efficiently? Recent work introduced metamorphic oracles which compare policy behavior on state pairs (s,t) where t is easier to solve; if π performs worse on t than on s, we know that t is a bug. Here, we show how to automatically design such oracles in classical planning, based on simulation relations between states. We introduce two oracle families of this kind: first, morphing query states t to obtain suitable s; second, maintaining and comparing upper bounds on h* across the states encountered during testing. Our experiments on ASNet policies show that these oracles can find bugs much more quickly than the existing alternatives, which are search-based; and that the combination of our oracles with search-based ones almost consistently dominates all other oracles.
en
dc.language.iso
en
-
dc.relation.ispartofseries
Proceedings of the ... International Conference on Automated Planning and Scheduling
-
dc.rights.uri
http://rightsstatements.org/vocab/InC/1.0/
-
dc.subject
classical planning techniques and analysis
en
dc.subject
metamorphic oracles
en
dc.subject
action-policy testing
en
dc.title
Automatic Metamorphic Test Oracles for Action-Policy Testing