<div class="csl-bib-body">
<div class="csl-entry">Zhao, H., Deng, S., Chen, F., Yin, J., Dustdar, S., & Zomaya, A. Y. (2023). Learning to Schedule Multi-Server Jobs With Fluctuated Processing Speeds. <i>IEEE Transactions on Parallel and Distributed Systems</i>, <i>34</i>(1), 234–245. https://doi.org/10.1109/TPDS.2022.3215947</div>
</div>
-
dc.identifier.issn
1045-9219
-
dc.identifier.uri
http://hdl.handle.net/20.500.12708/142173
-
dc.description.abstract
Multi-server jobs are imperative in modern cloud computing systems. A noteworthy feature of multi-server jobs is that they usually request multiple computing devices simultaneously for their execution. How to schedule multi-server jobs online with high system efficiency is a topic of great concern. First, the scheduling decisions have to satisfy the service locality constraints. Second, the scheduling decisions need to be made online without knowledge of future job arrivals. Third, and most importantly, the actual service rate experienced by a job usually fluctuates because of dynamic voltage and frequency scaling (DVFS) and power oversubscription techniques when multiple types of jobs are co-located. Many online algorithms with theoretical performance guarantees have been proposed. However, most of them require the processing speeds to be known in advance, so that job completion times can be calculated exactly. To present a theoretically guaranteed online scheduling algorithm for multi-server jobs that does not need to know the actual processing speeds a priori, in this article we propose Esdp (Efficient Sampling-based Dynamic Programming), which learns the distribution of the fluctuating processing speeds over time while seeking to maximize the cumulative overall utility. The cumulative overall utility is formulated as the sum of the utilities of successfully serving each multi-server job minus the penalty for operating, maintenance, and energy costs. Esdp is proved to have polynomial complexity and logarithmic regret, which is a state-of-the-art result. We also validate it with extensive simulations, and the results show that the proposed algorithm outperforms several benchmark policies, with improvements of up to 73%, 36%, and 28%, respectively.
en
dc.language.iso
en
-
dc.publisher
IEEE COMPUTER SOC
-
dc.relation.ispartof
IEEE Transactions on Parallel and Distributed Systems
-
dc.subject
Bipartite graph
en
dc.subject
dynamic programming
en
dc.subject
multi-server job
en
dc.subject
online learning
en
dc.subject
regret analysis
en
dc.title
Learning to Schedule Multi-Server Jobs With Fluctuated Processing Speeds