<div class="csl-bib-body">
<div class="csl-entry">Träff, J. L. (2025). <i>Communication Round and Computation Efficient Exclusive Prefix-Sums Algorithms (for MPI_Exscan)</i>. arXiv. https://doi.org/10.34726/10821</div>
</div>
-
dc.identifier.uri
http://hdl.handle.net/20.500.12708/219401
-
dc.identifier.uri
https://doi.org/10.34726/10821
-
dc.description.abstract
Parallel scan primitives compute element-wise inclusive or exclusive prefix sums of input vectors contributed by p consecutively ranked processors under an associative, binary operator ⊕. In message-passing systems with bounded, one-ported communication capabilities, at least ⌈log2 p⌉ or ⌈log2(p − 1)⌉ communication rounds are required to perform the scans. While there are well-known, simple algorithms for the inclusive scan that solve the problem in ⌈log2 p⌉ communication rounds with ⌈log2 p⌉ applications of ⊕ (which could be expensive), the exclusive scan appears more difficult. Conventionally, the problem is solved with either ⌈log2(p − 1)⌉ + 1 communication rounds (e.g., by shifting the input vectors), or in ⌈log2 p⌉ communication rounds with 2⌈log2 p⌉ − 1 applications of ⊕ (by a modified inclusive scan algorithm). We give a new, simple algorithm that computes the exclusive prefix sums in q = ⌈log2(p − 1) + log2 4 3 ⌉ simultaneous send-receive communication rounds with q − 1 applications of ⊕. We compare the three algorithms implemented in MPI against the MPI library native MPI Exscan primitive on a small, 36-node cluster with a state-of-the-art MPI library, indicating possible and worthwhile improvements to standard implementations. The algorithms assume input vectors to be small so that performance is dominated by the num- ber of communication rounds. For large input vectors, other (pipelined, fixed-degree tree) algorithms must be used.
en
dc.language.iso
en
-
dc.rights.uri
http://creativecommons.org/licenses/by/4.0/
-
dc.subject
MPI
en
dc.subject
MPI (Message-Passing Interface)
en
dc.subject
Prefix-Sums Algorithms
en
dc.subject
algorithms
en
dc.title
Communication Round and Computation Efficient Exclusive Prefix-Sums Algorithms (for MPI_Exscan)
en
dc.type
Preprint
en
dc.type
Preprint
de
dc.rights.license
Creative Commons Namensnennung 4.0 International
de
dc.rights.license
Creative Commons Attribution 4.0 International
en
dc.identifier.doi
10.34726/10821
-
dc.identifier.arxiv
2507.04785
-
tuw.researchTopic.id
I1
-
tuw.researchTopic.id
I2
-
tuw.researchTopic.id
C5
-
tuw.researchTopic.name
Logic and Computation
-
tuw.researchTopic.name
Computer Engineering and Software-Intensive Systems