Neufeld, E., Ciabattoni, A., & Tulcan, R. F. (2024). Norm Compliance in Reinforcement Learning Agents via Restraining Bolts. In J. Savelka, J. Harasta, T. Novotna, & J. Misek (Eds.), Legal Knowledge and Information Systems (pp. 119–130). https://doi.org/10.3233/FAIA241239
ethical reinforcement learning; safe reinforcement learning; LTL over finite traces
en
Abstract:
We modify the restraining bolt technique, originally designed for safe reinforcement learning, to regulate agent behavior in alignment with social, ethical, and legal norms. Rather than maximizing rewards for norm compliance, our approach minimizes penalties for norm violations. In case studies, we show that the approach captures benchmark challenges in normative reasoning, such as contrary-to-duty obligations, exceptions, and temporal obligations.
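
The penalty-based idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a hand-written finite-state monitor standing in for a norm specification (the keywords mention LTL over finite traces, but no formula is compiled here), and every name in it (`NormMonitor`, `bolted_reward`, the labels `clear` and `restricted`, the penalty value) is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class NormMonitor:
    """Hypothetical finite-state monitor for a single norm (illustrative only)."""
    transitions: dict      # maps (state, label) to the next monitor state
    violating: set         # monitor states in which the norm counts as violated
    penalty: float         # non-positive value charged at each step spent in a violating state
    state: str = "ok"      # assumed initial monitor state

    def step(self, label: str) -> float:
        # Pairs missing from the table leave the monitor state unchanged.
        self.state = self.transitions.get((self.state, label), self.state)
        return self.penalty if self.state in self.violating else 0.0


def bolted_reward(env_reward: float, monitors: list, label: str) -> float:
    """Add the (non-positive) penalties of all monitors to the environment reward.

    Compliance earns no bonus; only violations change the reward, so an agent
    maximizing return is pushed to minimize accumulated penalties.
    """
    return env_reward + sum(m.step(label) for m in monitors)


# Hypothetical prohibition: "do not enter the restricted zone".
no_entry = NormMonitor(
    transitions={("ok", "restricted"): "viol", ("viol", "clear"): "ok"},
    violating={"viol"},
    penalty=-5.0,
)

trace = ["clear", "restricted", "clear"]
rewards = [bolted_reward(1.0, [no_entry], label) for label in trace]
print(rewards)
```

In a learning loop, the monitor state would typically be appended to the agent's observation so that the shaped reward remains a function of the (extended) state; that design choice is an assumption of this sketch rather than a detail stated in the abstract.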