<div class="csl-bib-body">
<div class="csl-entry">Morichetta, A., Pusztai, T. W., Vij, D., Casamayor Pujol, V., Raith, P. A., Xiong, Y., Nastic, S., Dustdar, S., & Zhang, Z. (2023). Demystifying deep learning in predictive monitoring for cloud-native SLOs. In C. A. Ardagna, N. Atukorala, P. Beckmann, C. C. Chang, R. N. Chang, C. Evangelinos, J. Fan, G. Fox, J. Fox, C. Hagleitner, Z. Jin, T. Kosar, & M. Parashar (Eds.), <i>2023 IEEE 16th International Conference on Cloud Computing (CLOUD)</i> (pp. 1–11). IEEE. https://doi.org/10.1109/CLOUD60044.2023.00013</div>
</div>
-
dc.identifier.uri
http://hdl.handle.net/20.500.12708/191153
-
dc.description.abstract
The complexity inherent in managing cloud computing systems calls for novel solutions that can promptly and effectively enforce high-level Service Level Objectives (SLOs). Unfortunately, most current SLO management solutions rely on reactive approaches, i.e., correcting SLO violations only after they have occurred. Further, the few methods that explore predictive techniques to prevent SLO violations focus solely on forecasting low-level system metrics, such as CPU and memory utilization. Although valid in some cases, these metrics do not necessarily provide clear and actionable insights into application behavior. This paper presents a novel approach that directly predicts high-level SLOs using low-level system metrics. We pursue this goal by training and optimizing two state-of-the-art neural network models: a Long Short-Term Memory (LSTM) network and a Transformer-based model. Our models provide actionable insights into application behavior by establishing proper connections between the evolution of low-level workload-related metrics and the high-level SLOs. We demonstrate our approach to selecting and preparing the data. We show in practice how to optimize the LSTM and the Transformer by targeting efficiency as a high-level SLO metric and performing a comparative analysis. We also show how these models behave when the input workloads come from different distributions and, consequently, demonstrate their ability to generalize in heterogeneous systems. Finally, we operationalize our two models by integrating them into the Polaris framework, which we have been developing to enable a performance-driven, SLO-native approach to cloud computing.
en
dc.language.iso
en
-
dc.subject
workload prediction
en
dc.subject
neural networks
en
dc.subject
cloud
en
dc.subject
LSTM
en
dc.subject
Transformers
en
dc.title
Demystifying deep learning in predictive monitoring for cloud-native SLOs
en
dc.type
Inproceedings
en
dc.type
Konferenzbeitrag
de
dc.contributor.affiliation
Futurewei Technologies, Inc., Santa Clara, CA, USA
-
dc.contributor.editoraffiliation
Argonne National Laboratory
-
dc.contributor.editoraffiliation
Iowa State University, United States of America (the)
-
dc.contributor.editoraffiliation
IBM Research, TJ Watson Research Center, USA
-
dc.contributor.editoraffiliation
IBM Research, Cambridge
-
dc.contributor.editoraffiliation
University of Virginia, United States of America (the)
-
dc.contributor.editoraffiliation
IBM Research Europe - Zurich Lab
-
dc.contributor.editoraffiliation
University at Buffalo, State University of New York, United States of America (the)
-
dc.relation.isbn
979-8-3503-0481-7
-
dc.relation.doi
10.1109/CLOUD60044.2023
-
dc.relation.issn
2159-6182
-
dc.description.startpage
1
-
dc.description.endpage
11
-
dc.type.category
Full-Paper Contribution
-
dc.relation.eissn
2159-6190
-
tuw.booktitle
2023 IEEE 16th International Conference on Cloud Computing (CLOUD)
-
tuw.relation.publisher
IEEE
-
tuw.researchTopic.id
I4
-
tuw.researchTopic.name
Information Systems Engineering
-
tuw.researchTopic.value
100
-
tuw.publication.orgunit
E194-02 - Forschungsbereich Distributed Systems
-
tuw.publisher.doi
10.1109/CLOUD60044.2023.00013
-
dc.description.numberOfPages
11
-
tuw.author.orcid
0000-0003-3765-3067
-
tuw.author.orcid
0000-0001-9765-6310
-
tuw.author.orcid
0000-0003-2830-8368
-
tuw.author.orcid
0000-0003-3293-9437
-
tuw.author.orcid
0000-0003-0410-6315
-
tuw.author.orcid
0000-0001-6872-8821
-
tuw.editor.orcid
0000-0001-7426-4795
-
tuw.event.name
16th International Conference on Cloud Computing (IEEE CLOUD 2023)
en
tuw.event.startdate
02-07-2023
-
tuw.event.enddate
08-07-2023
-
tuw.event.online
Hybrid
-
tuw.event.type
Event for scientific audience
-
tuw.event.place
Chicago
-
tuw.event.country
US
-
tuw.event.presenter
Morichetta, Andrea
-
wb.sciencebranch
Informatik
-
wb.sciencebranch.oefos
1020
-
wb.sciencebranch.value
100
-
item.fulltext
no Fulltext
-
item.grantfulltext
none
-
item.openairecristype
http://purl.org/coar/resource_type/c_5794
-
item.languageiso639-1
en
-
item.openairetype
conference paper
-
item.cerifentitytype
Publications
-
crisitem.author.dept
E194-02 - Forschungsbereich Distributed Systems
-
crisitem.author.dept
E194-02 - Forschungsbereich Distributed Systems
-
crisitem.author.dept
E194-02 - Forschungsbereich Distributed Systems
-
crisitem.author.dept
E194-02 - Forschungsbereich Distributed Systems
-
crisitem.author.dept
E194-02 - Forschungsbereich Distributed Systems
-
crisitem.author.dept
E194-02 - Forschungsbereich Distributed Systems
-
crisitem.author.dept
Futurewei Technologies, Inc., Santa Clara, CA, USA
-
crisitem.author.orcid
0000-0003-3765-3067
-
crisitem.author.orcid
0000-0001-9765-6310
-
crisitem.author.orcid
0000-0003-2830-8368
-
crisitem.author.orcid
0000-0003-3293-9437
-
crisitem.author.orcid
0000-0003-0410-6315
-
crisitem.author.orcid
0000-0001-6872-8821
-
crisitem.author.parentorg
E194 - Institut für Information Systems Engineering
-
crisitem.author.parentorg
E194 - Institut für Information Systems Engineering
-
crisitem.author.parentorg
E194 - Institut für Information Systems Engineering
-
crisitem.author.parentorg
E194 - Institut für Information Systems Engineering
-
crisitem.author.parentorg
E194 - Institut für Information Systems Engineering
-
crisitem.author.parentorg
E194 - Institut für Information Systems Engineering