Hunold, S. (2022). Performance Tuning of MPI Collectives - Status Quo and Open Problems [Presentation]. CaSToRC HPC National Competence Center Fall Seminar Series 2022, Unknown. http://hdl.handle.net/20.500.12708/153709
CaSToRC HPC National Competence Center Fall Seminar Series 2022
Event date:
29-Nov-2022
Event place:
Unknown
Keywords:
MPI Collectives
Abstract:
MPI collective operations such as MPI_Allreduce are fundamental building blocks of large-scale applications in High Performance Computing. Since the MPI standard defines only the semantics of MPI communication operations, MPI implementations (e.g., Open MPI or MPICH) are free to implement the collective operations in whatever way performs best. For important collective operations, e.g., MPI_Allreduce and MPI_Bcast, MPI libraries provide several algorithms for each operation.
In this talk, we investigate the problem of tuning MPI collective operations on a given supercomputer, i.e., selecting the best algorithm for a given communication problem. For example, we would like to answer the following question: what is the fastest algorithm to execute an MPI_Bcast with 100 bytes of data using 16 compute nodes and 32 processes per compute node? We also need to discuss the accuracy of methods that support the analysis of MPI applications, such as profiling or tracing. In addition, we show that basic methods from statistics and machine learning can help us find efficient algorithms for collective operations in a practical setting.
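As a rough illustration of the selection problem described above, the following Python sketch picks the algorithm with the smallest median run-time from repeated measurements. It is only a sketch under stated assumptions and not the method presented in the talk: the function name `select_fastest`, the algorithm labels, and all run-time values are hypothetical and do not come from any MPI library or real benchmark.

```python
from statistics import median


def select_fastest(measurements):
    """Return (algorithm, median_runtime) for the smallest median run-time.

    measurements: dict mapping an algorithm label to a list of run-times.
    The median is used because single slow outliers (e.g., caused by
    system noise) would otherwise distort the comparison.
    """
    medians = {
        alg: median(times) for alg, times in measurements.items() if times
    }
    if not medians:
        raise ValueError("no measurements given")
    best = min(medians, key=medians.get)
    return best, medians[best]


# Hypothetical run-times in microseconds for one problem instance:
# MPI_Bcast, 100 bytes, 16 nodes x 32 processes per node.
# Labels and values are made up for illustration only.
sample = {
    "binomial_tree": [12.1, 11.8, 12.4, 30.2, 12.0],
    "linear": [25.3, 24.9, 25.1, 25.6, 25.0],
    "scatter_allgather": [14.2, 14.0, 14.5, 14.1, 14.3],
}

best_algorithm, best_median = select_fastest(sample)
print(best_algorithm, best_median)
```

With these made-up values, the median selects "binomial_tree" despite its single outlier of 30.2, whereas the arithmetic mean would favor "scatter_allgather". This is one reason the choice of summary statistic matters when comparing algorithms.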
Project title:
Offline- und Online-Autotuning von Parallelen Programmen: P33884-N (Fonds zur Förderung der wissenschaftlichen Forschung (FWF))
Research Areas:
Computer Engineering and Software-Intensive Systems: 90%; Computer Science Foundations: 10%