<div class="csl-bib-body">
<div class="csl-entry">Träff, J. L. (2024). <i>Optimal Broadcast Schedules in Logarithmic Time with Applications to Broadcast, All-Broadcast, Reduction and All-Reduction</i>. arXiv. https://doi.org/10.34726/10820</div>
</div>
-
dc.identifier.uri
http://hdl.handle.net/20.500.12708/219400
-
dc.identifier.uri
https://doi.org/10.34726/10820
-
dc.description.abstract
We give optimally fast O(log p) time (per processor) algorithms for computing round- optimal broadcast schedules for message-passing parallel computing systems. This affirmatively answers difficult questions posed in a SPAA 2022 BA and a CLUSTER 2022 paper. We observe that the computed schedules and circulant communication graph can likewise be used for reduction, all-broadcast and all-reduction as well, leading to new, round-optimal algorithms for these problems. These observations affirmatively answer open questions posed in a CLUSTER 2023 paper.
The problem is to broadcast n indivisible blocks of data from a given root processor to all other processors in a (subgraph of a) fully connected network of p processors with fully bidirectional, one-ported communication capabilities. In this model, n − 1 + ⌈log2 p⌉ communication rounds are required. Our new algorithms compute for each processor in the network receive and send schedules each of size ⌈log2 p⌉ that determine uniquely in O(1) time for each communication round the new block that the processor will receive, and the already received block it has to send. Schedule computations are done independently per processor without communication. The broadcast communication subgraph is an easily computable, directed, ⌈log2 p⌉-regular circulant graph also used elsewhere. We show how the schedule computations can be done in optimal time and space of O(log p), improving significantly over previous results of O(p log2 p) and O(log3 p), respectively. The schedule computation and broadcast algorithms are simple to implement, but correctness and complexity are not obvious. The schedules are used for new implementations of the MPI (Message-Passing Interface) collectives MPI Bcast, MPI Allgatherv,MPI Reduce and MPI Reduce scatter. Pre- liminary experimental results are given. Carefully engineered and extensively evaluated implementations will be presented elsewhere.
en
dc.language.iso
en
-
dc.rights.uri
http://creativecommons.org/licenses/by/4.0/
-
dc.subject
MPI
en
dc.subject
MPI (Message-Passing Interface)
en
dc.subject
Distributed Computing
en
dc.subject
Parallel Computing
en
dc.subject
Cluster Computing
en
dc.title
Optimal Broadcast Schedules in Logarithmic Time with Applications to Broadcast, All-Broadcast, Reduction and All-Reduction
en
dc.type
Preprint
en
dc.type
Preprint
de
dc.rights.license
Creative Commons Namensnennung 4.0 International
de
dc.rights.license
Creative Commons Attribution 4.0 International
en
dc.identifier.doi
10.34726/10820
-
tuw.researchTopic.id
I1
-
tuw.researchTopic.id
I2
-
tuw.researchTopic.id
C5
-
tuw.researchTopic.name
Logic and Computation
-
tuw.researchTopic.name
Computer Engineering and Software-Intensive Systems