Mayer, F. (2024). Applications of concentration inequalities in distributional reinforcement learning [Diploma Thesis, Technische Universität Wien]. reposiTUm. https://doi.org/10.34726/hss.2025.124987
Distributional reinforcement learning extends traditional reinforcement learning by modeling the entire distribution of returns, which provides several advantages, such as insight into potential outcomes and associated risks, but comes at the cost of higher computational complexity. This thesis investigates the application of different concentration inequalities, specifically the Hoeffding, Bernstein, and Bennett inequalities, to derive tighter bounds on the Cramér distance between the estimated reward distributions and the true reward distribution. Tighter bounds enhance the analysis of algorithms such as speedy Q-learning within the distributional reinforcement learning framework. To validate the theoretical findings, a complexity analysis is conducted to determine which inequality provides the most robust and reliable bounds under varying accuracy requirements and environmental complexities. In addition, simulation studies are performed using the Taxi and FrozenLake environments from the Gymnasium library in Python. These simulations compare the performance of each inequality and observe their impact on the convergence behavior of the learning algorithms. The tightest bound on the Cramér distance is achieved using the Bennett inequality, followed by the bound obtained through the Bernstein inequality. However, when the number of training episodes is small, the bound derived from the Hoeffding inequality is tighter than the Bernstein bound.
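
The ordering reported in the abstract can be illustrated with the classical forms of the three inequalities. The following Python sketch is not taken from the thesis; the range, variance, and confidence level are illustrative assumptions. It compares the two-sided deviation bounds for the mean of a bounded variable on [0, 1]: with low variance, the Bennett bound is tightest throughout, while the Hoeffding bound beats the Bernstein bound at small sample sizes before Bernstein overtakes it.

import numpy as np
from scipy.optimize import brentq

# Illustrative parameters (assumptions, not values from the thesis):
# a variable on [0, 1], so |X - E[X]| <= b = 1, with assumed variance sigma2.
b = 1.0
sigma2 = 0.05
delta = 0.05                 # each bound holds with probability >= 1 - delta
L = np.log(2.0 / delta)      # log term shared by the two-sided bounds

def hoeffding(n):
    # Hoeffding: deviation <= range * sqrt(L / (2n)); the range here is 1.
    return np.sqrt(L / (2.0 * n))

def bernstein(n):
    # Bernstein: deviation <= sqrt(2 sigma^2 L / n) + 2 b L / (3 n).
    return np.sqrt(2.0 * sigma2 * L / n) + 2.0 * b * L / (3.0 * n)

def bennett(n):
    # Bennett: solve (n sigma^2 / b^2) * h(b t / sigma^2) = L for t,
    # with h(u) = (1 + u) log(1 + u) - u; no closed form, so invert numerically.
    h = lambda u: (1.0 + u) * np.log1p(u) - u
    f = lambda t: (n * sigma2 / b ** 2) * h(b * t / sigma2) - L
    return brentq(f, 1e-12, 10.0 * b)

for n in (10, 100, 1000, 10000):
    print(f"n={n:5d}  Hoeffding={hoeffding(n):.4f}  "
          f"Bernstein={bernstein(n):.4f}  Bennett={bennett(n):.4f}")

At n = 10 the printed bounds come out roughly as Bennett < Hoeffding < Bernstein, and by n = 100 the order is Bennett < Bernstein < Hoeffding, matching the small-sample crossover described in the abstract.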
Additional information:
Alternative title as translated by the author