<div class="csl-bib-body">
<div class="csl-entry">Kolluri, A., Sharma, R., Costa, M., Köpf, B., Nießen, T., Russinovich, M., Tople, S., &amp; Zanella Béguelin, S. (2026). Optimizing Agent Planning for Security and Autonomy. In <i>The Fourteenth International Conference on Learning Representations</i>, Rio de Janeiro, Brazil.</div>
</div>
-
dc.identifier.uri
http://hdl.handle.net/20.500.12708/228304
-
dc.description.abstract
Indirect prompt injection attacks threaten AI agents that execute consequential actions, motivating deterministic system-level defenses. Such defenses can provably block unsafe actions by enforcing confidentiality and integrity policies, but currently appear costly: they reduce task completion rates and increase token usage compared to probabilistic defenses. We argue that existing evaluations miss a key benefit of system-level defenses: reduced reliance on human oversight. We introduce autonomy metrics to quantify this benefit: the fraction of consequential actions an agent can execute without human-in-the-loop (HITL) approval while preserving security. To increase autonomy, we design a security-aware agent that (i) introduces richer HITL interactions, and (ii) explicitly plans for both task progress and policy compliance. We implement this agent design atop an existing information-flow control defense against prompt injection and evaluate it on the AgentDojo and WASP benchmarks. Experiments show that this approach yields higher autonomy without sacrificing utility (task completion).
en
dc.description.sponsorship
European Commission
-
dc.language.iso
en
-
dc.subject
AI Agents
en
dc.subject
Security
en
dc.subject
Prompt Injection Attacks
en
dc.subject
Information Flow Control
en
dc.subject
Autonomy
en
dc.title
Optimizing Agent Planning for Security and Autonomy
en
dc.type
Inproceedings
en
dc.type
Konferenzbeitrag
de
dc.relation.grantno
101034440
-
dc.type.category
Full-Paper Contribution
-
tuw.booktitle
The Fourteenth International Conference on Learning Representations
-
tuw.peerreviewed
true
-
tuw.project.title
Logics for Computer Science Program at TU Wien
-
tuw.researchTopic.id
I1
-
tuw.researchTopic.id
I4
-
tuw.researchTopic.name
Logic and Computation
-
tuw.researchTopic.name
Information Systems Engineering
-
tuw.researchTopic.value
80
-
tuw.researchTopic.value
20
-
tuw.publication.orgunit
E192-04 - Forschungsbereich Formal Methods in Systems Engineering
-
tuw.publication.orgunit
E056-13 - Fachbereich LogiCS
-
dc.description.numberOfPages
32
-
tuw.author.orcid
0000-0003-1792-4448
-
tuw.author.orcid
0000-0002-1928-1549
-
tuw.author.orcid
0009-0005-8004-0743
-
tuw.author.orcid
0000-0002-7712-0006
-
tuw.author.orcid
0009-0009-8306-0933
-
tuw.author.orcid
0000-0003-0479-9967
-
tuw.event.name
The Fourteenth International Conference on Learning Representations
en
tuw.event.startdate
23-04-2026
-
tuw.event.enddate
27-04-2026
-
tuw.event.online
On Site
-
tuw.event.type
Event for scientific audience
-
tuw.event.place
Rio de Janeiro
-
tuw.event.country
BR
-
tuw.event.presenter
Sharma, Rishi
-
wb.sciencebranch
Computer Sciences
-
wb.sciencebranch
Mathematics
-
wb.sciencebranch.oefos
1020
-
wb.sciencebranch.oefos
1010
-
wb.sciencebranch.value
80
-
wb.sciencebranch.value
20
-
item.languageiso639-1
en
-
item.openairetype
conference paper
-
item.grantfulltext
none
-
item.openairecristype
http://purl.org/coar/resource_type/c_5794
-
item.cerifentitytype
Publications
-
item.fulltext
no Fulltext
-
crisitem.author.dept
E192-04 - Forschungsbereich Formal Methods in Systems Engineering