Evaluating Reinforcement-Learning-based Sepsis Treatments via Tabular and Continuous Stationary Distribution Correction Estimation

Weiss, Richard

doi:10.34726/hss.2025.126219

Record citation link:

https://doi.org/10.34726/hss.2025.126219
http://hdl.handle.net/20.500.12708/208809

Title:

Evaluating Reinforcement-Learning-based Sepsis Treatments via Tabular and Continuous Stationary Distribution Correction Estimation

Citation:

Weiss, R. (2025). Evaluating Reinforcement-Learning-based Sepsis Treatments via Tabular and Continuous Stationary Distribution Correction Estimation [Diploma Thesis, Technische Universität Wien]. reposiTUm. https://doi.org/10.34726/hss.2025.126219

reposiTUm-DOI:

10.34726/hss.2025.126219

CatalogPlus:

AC17408860

Publication type:

Thesis - Diploma thesis

Language:

English

Authors:

Weiss, Richard

Supervisor:

Heitzinger, Clemens

Organizational unit:

E194 - Institut für Information Systems Engineering

Date (published):

2025

Extent:

110

Keywords:

Reinforcement learning; Policy evaluation; Distribution correction estimation; Medical treatment policy

Abstract:

This work presents the results of state-of-the-art offline, behavior-agnostic policy evaluation algorithms based on stationary distribution correction estimation, evaluated in a healthcare setting using data from the AmsterdamUMCdb. We first present the theory of these algorithms, which includes the introduction of four tabular estimators and a revision of the well-known DualDICE, GenDICE, and GradientDICE. All algorithms are implemented in a modular open-source Python library. To evaluate their efficacy, the algorithms are tested in the BoyanChain environment as well as the OpenAI Gym environments FrozenLake, Taxi, and CartPole. The continuous-state-space algorithms DualDICE, GenDICE, and GradientDICE are run directly on the healthcare dataset. Additionally, the state space of the healthcare application is clustered in order to perform policy evaluation in the tabular setting. Our analysis examines how all estimators behave in practice and relates the results to the underlying theory.

Further information:

Thesis not yet received at the library - data not verified
Title differs according to the author's translation

License:

In copyright

Appears in collections:

Thesis