Learning to Schedule Multi-Server Jobs With Fluctuated Processing Speeds

Zhao, Hailiang; Deng, Shuiguang; Chen, Feiyi; Yin, Jianwei; Dustdar, Schahram; Zomaya, Albert Y.

doi:10.1109/TPDS.2022.3215947

DC Field

Value

Language

dc.contributor.author

Zhao, Hailiang

dc.contributor.author

Deng, Shuiguang

dc.contributor.author

Chen, Feiyi

dc.contributor.author

Yin, Jianwei

dc.contributor.author

Dustdar, Schahram

dc.contributor.author

Zomaya, Albert Y.

dc.date.accessioned

2023-01-24T16:18:05Z

dc.date.available

2023-01-24T16:18:05Z

dc.date.issued

2023

dc.identifier.citation

Zhao, H., Deng, S., Chen, F., Yin, J., Dustdar, S., & Zomaya, A. Y. (2023). Learning to Schedule Multi-Server Jobs With Fluctuated Processing Speeds. IEEE Transactions on Parallel and Distributed Systems, 34(1), 234–245. https://doi.org/10.1109/TPDS.2022.3215947

dc.identifier.issn

1045-9219

dc.identifier.uri

http://hdl.handle.net/20.500.12708/142173

dc.description.abstract

Multi-server jobs are essential in modern cloud computing systems. A noteworthy feature of multi-server jobs is that they usually request multiple computing devices simultaneously for their execution. How to schedule multi-server jobs online with high system efficiency is a topic of great concern, and it raises three challenges. First, the scheduling decisions have to satisfy the service locality constraints. Second, the scheduling decisions need to be made online without knowledge of future job arrivals. Third, and most importantly, the actual service rate experienced by a job usually fluctuates because of dynamic voltage and frequency scaling (DVFS) and power oversubscription techniques when multiple types of jobs are co-located. Many online algorithms with theoretical performance guarantees have been proposed. However, most of them require the processing speeds to be known, so that the job completion times can be calculated exactly. To provide a theoretically guaranteed online scheduling algorithm for multi-server jobs without knowing the actual processing speeds a priori, in this article we propose Esdp (Efficient Sampling-based Dynamic Programming), which learns the distribution of the fluctuating processing speeds over time and simultaneously seeks to maximize the cumulative overall utility. The cumulative overall utility is formulated as the sum of the utilities of successfully serving each multi-server job minus the penalty on the operating, maintenance, and energy costs. Esdp is proved to have polynomial complexity and logarithmic regret, which is a state-of-the-art result. We also validate it with extensive simulations, and the results show that the proposed algorithm outperforms several benchmark policies, with improvements of up to 73%, 36%, and 28%, respectively.

dc.language.iso

dc.publisher

IEEE COMPUTER SOC

dc.relation.ispartof

IEEE Transactions on Parallel and Distributed Systems

dc.subject

Bipartite graph

dc.subject

dynamic programming

dc.subject

multi-server job

dc.subject

online learning

dc.subject

regret analysis

dc.title

Learning to Schedule Multi-Server Jobs With Fluctuated Processing Speeds

dc.type

Article

dc.type

Article

dc.identifier.scopus

2-s2.0-85140752407

dc.identifier.url

https://api.elsevier.com/content/abstract/scopus_id/85140752407

dc.contributor.affiliation

Zhejiang University, China

dc.contributor.affiliation

Zhejiang University, China

dc.contributor.affiliation

Zhejiang University, China

dc.contributor.affiliation

Zhejiang University, China

dc.contributor.affiliation

University of Sydney, Australia

dc.description.startpage

234

dc.description.endpage

245

dcterms.dateSubmitted

2022-04-08

dc.type.category

Original Research Article

tuw.container.volume

34

tuw.container.issue

1

tuw.journal.peerreviewed

true

tuw.peerreviewed

true

wb.publication.intCoWork

International Co-publication

tuw.researchTopic.id

I4a

tuw.researchTopic.name

Information Systems Engineering

tuw.researchTopic.value

100

dcterms.isPartOf.title

IEEE Transactions on Parallel and Distributed Systems

tuw.publication.orgunit

E194-02 - Forschungsbereich Distributed Systems

tuw.publisher.doi

10.1109/TPDS.2022.3215947

dc.date.onlinefirst

2022-10-20

dc.identifier.eissn

1558-2183

dc.description.numberOfPages

tuw.author.orcid

0000-0003-2850-6815

tuw.author.orcid

0000-0001-5015-6095

tuw.author.orcid

0000-0001-6872-8821

tuw.author.orcid

0000-0002-3090-1059

dc.description.sponsorshipexternal

National Science Foundation of China (NSFC)

dc.description.sponsorshipexternal

National Science Foundation of China (NSFC)

dc.description.sponsorshipexternal

Key Research Project of Zhejiang Province

dc.description.sponsorshipexternal

Zhejiang University Deqing Institute of Advanced Technology and Industrialization (ZDATI)

dc.relation.grantnoexternal

Grant U20A20173

dc.relation.grantnoexternal

Grant 62125206

dc.relation.grantnoexternal

Grant 2022C01145

wb.sci

true

wb.sciencebranch

Computer Science

wb.sciencebranch.oefos

1020

wb.sciencebranch.value

100

item.grantfulltext

none

item.openairecristype

http://purl.org/coar/resource_type/c_2df8fbb1

item.openairetype

research article

item.languageiso639-1

item.cerifentitytype

Publications

item.fulltext

no Fulltext

crisitem.author.dept

Zhejiang University

crisitem.author.dept

Zhejiang University

crisitem.author.dept

Zhejiang University

crisitem.author.dept

Zhejiang University

crisitem.author.dept

E194-02 - Forschungsbereich Distributed Systems

crisitem.author.dept

University of Sydney

crisitem.author.orcid

0000-0003-2850-6815

crisitem.author.orcid

0000-0001-5015-6095

crisitem.author.orcid

0000-0001-6872-8821

crisitem.author.orcid

0000-0002-3090-1059

crisitem.author.parentorg

E194 - Institut für Information Systems Engineering

Appears in Collections:

Article
