Weiss, R. (2025). Evaluating Reinforcement-Learning-based Sepsis Treatments via Tabular and Continuous Stationary Distribution Correction Estimation [Diploma Thesis, Technische Universität Wien]. reposiTUm. https://doi.org/10.34726/hss.2025.126219
E194 - Institut für Information Systems Engineering
Date (published):
2025
Number of Pages:
110
Keywords:
Reinforcement learning; Policy evaluation; Distribution correction estimation; Medical treatment policy
Abstract:
This work presents the results of state-of-the-art offline, behavior-agnostic policy evaluation algorithms based on stationary distribution correction estimation, evaluated in a healthcare setting using data from the AmsterdamUMCdb. We first present the theory of these algorithms, which includes the introduction of four tabular estimators and a revision of the well-known DualDICE, GenDICE, and GradientDICE. All algorithms are implemented in a modular open-source Python library. To evaluate their efficacy, the algorithms are tested in the BoyanChain environment as well as in the OpenAI Gym environments FrozenLake, Taxi, and CartPole. The continuous state space algorithms DualDICE, GenDICE, and GradientDICE are run directly on the healthcare dataset. Additionally, the state space of healthcare applications is clustered in order to perform policy evaluation in the tabular setting. Our analysis provides a comprehensive examination of how all estimators function in practice, elucidating the underlying theory and the connections between the results and the theory.
Additional information:
Thesis not yet received at the library - data not verified. Title differs from the original; translated by the author.