<div class="csl-bib-body">
<div class="csl-entry">Träff, J. L. (2025). <i>Optimal, Non-pipelined Reduce-scatter and Allreduce Algorithms</i>. arXiv. https://doi.org/10.34726/10760</div>
</div>
-
dc.identifier.uri
http://hdl.handle.net/20.500.12708/219280
-
dc.identifier.uri
https://doi.org/10.34726/10760
-
dc.description.abstract
The reduce-scatter collective operation in which p processors in a network of processors collectively reduce p input vectors into a result vector that is partitioned over the processors is important both in its own right and as building block for other collective operations. We present a surprisingly simple, but non-trivial algorithm for solving this problem optimally in \lceil\log_2 p\rceil communication rounds with each processor sending, receiving and reducing exactly p-1 blocks of vector elements. We combine this with a similarly simple, well-known allgather algorithm to get a volume optimal algorithm for the allreduce collective operation where the result vector is replicated on all processors. The communication pattern is a simple, \lceil\log_2 p\rceil-regular, circulant graph also used elsewhere. The algorithms assume the binary reduction operator to be commutative and we discuss this assumption. The algorithms can readily be implemented and used for the collective operations MPI_Reduce_scatter_block, MPI_Reduce_scatter and MPI_Allreduce as specified in the MPI standard. We also observe that the reduce-scatter algorithm can be used as a template for round-optimal all-to-all communication and the collective MPI_Alltoall operation.
en
dc.language.iso
en
-
dc.rights.uri
http://rightsstatements.org/vocab/InC/1.0/
-
dc.subject
MPI
en
dc.subject
MPI (Message-Passing Interface)
en
dc.subject
algorithms
en
dc.subject
Parallel Computing
en
dc.subject
reduce scatter
en
dc.subject
all-reduce algorithms
en
dc.title
Optimal, Non-pipelined Reduce-scatter and Allreduce Algorithms
en
dc.type
Preprint
en
dc.type
Preprint
de
dc.rights.license
Urheberrechtsschutz
de
dc.rights.license
In Copyright
en
dc.identifier.doi
10.34726/10760
-
dc.identifier.arxiv
2410.14234v4
-
tuw.researchTopic.id
I1
-
tuw.researchTopic.id
I2
-
tuw.researchTopic.id
C5
-
tuw.researchTopic.name
Logic and Computation
-
tuw.researchTopic.name
Computer Engineering and Software-Intensive Systems