Mayerhofer, R. (2023). Reinforcement-learning-based, application-agnostic, and explainable auto-scaling in the cloud utilizing high-level SLOs [Diploma Thesis, Technische Universität Wien]. reposiTUm. https://doi.org/10.34726/hss.2023.106505
E194 - Institut für Information Systems Engineering
Date (published):
2023
Number of Pages:
92
Keywords:
Auto scaling; Service Level Objective; Reinforcement Learning; Q-learning; Q-Threshold; Polaris framework; Cloud Elasticity
Abstract:
Cloud computing is a widely adopted paradigm in the software industry. The ability to adapt the resources provisioned for an application to its actual demand is called auto-scaling, and it is crucial for keeping costs within limits while ensuring sufficient performance. Effective auto-scaling is a multi-dimensional problem and an active area of research. The industry standard for auto-scaling relies on static thresholds over low-level metrics such as CPU utilization, while researchers are experimenting with applying Machine Learning techniques to auto-scaling. Static thresholds are hard to set up and need manual correction, and low-level metrics are disconnected from business goals. Reinforcement Learning (RL), on the other hand, is a popular approach to autonomously learning an auto-scaling policy. While promising, RL introduces new problems to the auto-scaling domain, such as a lack of explainability and interpretability, high complexity, long learning phases, and poor worst-case performance. We aim to find ways to auto-scale efficiently while bridging the gap between auto-scaling and business goals without the undesirable properties of RL solutions.

This thesis presents two approaches to auto-scaling, Extended-Q-Threshold and HPA-Q-Threshold, both building upon Q-Threshold, an auto-scaling system from the literature. Our auto-scalers are integrated into the Polaris framework and built with a flexible architecture in mind. We extend and adapt Q-Threshold, an approach in which the RL agent controls the usually static threshold, effectively making it dynamic. Our adaptations tackle experimentally identified shortcomings of the Q-Threshold approach. Furthermore, we generalize the approach and apply different scaling metrics and rewards, thus enabling the further development and evaluation of this promising approach.

We show how a modern, flexible auto-scaler can be integrated with the Polaris framework and run in a Kubernetes cluster. Our experiments evaluate the effectiveness of the proposed adaptations and show that some are necessary to prevent the identified issues of Q-Threshold, while others, such as different scaling metrics and reward definitions, must be carefully assessed and tested. Overall, the adaptations help ensure the interpretability of the auto-scaler through a very lightweight implementation of RL. Furthermore, our auto-scaler possesses other desirable characteristics, such as bounding worst-case behavior and guaranteeing acceptable early-stage performance.
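
To illustrate the core idea behind Q-Threshold as summarized above, the following is a minimal, hypothetical Q-learning sketch in which an agent adjusts an otherwise static scaling threshold. The state encoding, action set, reward shape, hyperparameters, and all names are assumptions made for illustration only; they do not reflect the thesis implementation or any Polaris framework API.

```python
# Illustrative sketch (assumed design): a tabular Q-learning agent that nudges
# a CPU-utilization scaling threshold up or down instead of scaling directly.
import random
from collections import defaultdict

ACTIONS = (-5, 0, +5)          # lower, keep, or raise the threshold (percentage points)
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

q_table = defaultdict(float)   # (state, action) -> estimated value
threshold = 70                 # initial CPU-utilization threshold in percent

def state(cpu_utilization, slo_violated):
    """Discretize observations into a small state space (assumed encoding)."""
    return (round(cpu_utilization / 10) * 10, slo_violated)

def reward(slo_violated, replicas):
    """Assumed reward: penalize SLO violations, mildly penalize provisioned replicas."""
    return (-10 if slo_violated else 1) - 0.1 * replicas

def choose_action(s):
    """Epsilon-greedy selection over threshold adjustments."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q_table[(s, a)])

def update(s, a, r, s_next):
    """Standard tabular Q-learning update."""
    best_next = max(q_table[(s_next, a2)] for a2 in ACTIONS)
    q_table[(s, a)] += ALPHA * (r + GAMMA * best_next - q_table[(s, a)])

# One control-loop iteration; monitoring values are placeholders.
cpu, violated, replicas = 85.0, True, 4
s = state(cpu, violated)
a = choose_action(s)
threshold = min(95, max(5, threshold + a))       # threshold a conventional scaler would use
cpu2, violated2, replicas2 = 72.0, False, 5      # next observation after scaling
update(s, a, reward(violated2, replicas2), state(cpu2, violated2))
```

Because the agent only tunes a threshold consumed by an otherwise conventional scaler, the resulting policy remains small and inspectable, which is one way to read the interpretability and bounded worst-case claims in the abstract.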