Träff, J. L., Hunold, S., Vardas, I., & Funk, N. M. (2023). Uniform Algorithms for Reduce-scatter and (most) other Collectives for MPI. In 2023 IEEE International Conference on Cluster Computing (CLUSTER) (pp. 284–294). IEEE. https://doi.org/10.1109/CLUSTER52292.2023.00031
IEEE International Conference on Cluster Computing (IEEE CLUSTER 2023)
Event date:
31-Oct-2023 - 3-Nov-2023
-
Event place:
Santa Fe, New Mexico, United States of America
-
Number of Pages:
11
-
Publisher:
IEEE, Piscataway
-
Peer reviewed:
Yes
-
Keywords:
MPI; HPC; collective communication operations
Abstract:
We explore the use of a regular, circulant graph communication pattern for the implementation of the reduction-to-all (MPI_Allreduce) and, by specialization, the reduction-to-root (MPI_Reduce), the reduce-scatter (MPI_Reduce_scatter_block), the all-to-all-broadcast (MPI_Allgather) and the rooted gather and scatter (MPI_Gather and MPI_Scatter) collective operations, all as found in MPI (the Message-Passing Interface), for commutative operators and for any number of processes. The reduction-to-all algorithm reconstructs the little-known algorithm by Bar-Noy, Kipnis and Schieber (1993), which the paper considerably extends. We experiment with extensions and combinations of the algorithms for these operations, and examine their performance from the perspective of performance guidelines and in direct comparison to the implementations in common MPI libraries. On a small cluster with 36 × 32 cores and two larger HPC production systems, we show that we can achieve considerably better performance than standard MPI library implementations, especially for MPI_Reduce_scatter_block. Our algorithms perform consistently, which the implementations in standard MPI libraries sometimes do not. In a homogeneous, one-ported communication system with linear transmission costs, reduction-to-all, reduce-scatter and all-to-all-broadcast can all be implemented in O(log p + m) time steps for problems of size m, with small constants which we analyze and discuss.
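The circulant-graph idea in the abstract can be illustrated with a small sequential simulation. This is a minimal sketch under stated assumptions, not the paper's implementation: the function name `circulant_reduce_scatter`, the choice of skips by repeated halving with rounding up, and the use of integer addition as the commutative operator are all assumptions made for illustration. In each round, every simulated process r sends to process (r + s') mod p and receives from process (r - s') mod p, where s' is the current skip.

```python
def circulant_reduce_scatter(inputs):
    """Simulate a reduce-scatter (with +) over p simulated processes.

    inputs[r][b] is the contribution of process r to result block b.
    Returns (result, rounds), where result[r] is the fully reduced block r
    as it would end up on process r.
    Assumption: skips are obtained by halving p with rounding up.
    """
    p = len(inputs)
    # R[r][i] is the partial result held by process r for block (r + i) mod p.
    R = [[inputs[r][(r + i) % p] for i in range(p)] for r in range(p)]
    s = p
    rounds = 0
    while s > 1:
        skip = (s + 1) // 2  # ceil(s / 2)
        # All processes send at the same time, so snapshot outgoing data first.
        # Process r sends its partial results for blocks r+skip .. r+s-1.
        outgoing = [R[r][skip:s] for r in range(p)]
        for r in range(p):
            sender = (r - skip) % p
            # Block (sender + skip + j) mod p is block (r + j) mod p on receiver r.
            for j, value in enumerate(outgoing[sender]):
                R[r][j] += value
            del R[r][skip:]  # these partial results now live on another process
        s = skip
        rounds += 1
    return [R[r][0] for r in range(p)], rounds


if __name__ == "__main__":
    p = 5
    data = [[10 * r + b for b in range(p)] for r in range(p)]
    print(circulant_reduce_scatter(data))  # ([100, 105, 110, 115, 120], 3)
```

In this sketch every process takes part in ceil(log2 p) rounds and sends p - 1 blocks in total, which is consistent with the O(log p + m) bound stated in the abstract. Because partial results are combined in an order that depends on the process rank, the sketch relies on the operator being commutative.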
Research Areas:
Computer Engineering and Software-Intensive Systems: 90%; Computer Science Foundations: 10%