Hunold, S. (2023, December 8). Unveiling the Complexities of Performance Analysis and Optimization in HPC Systems [Presentation]. Universität Münster, Münster, Germany.
Although several parallel programming models exist, the Message
Passing Interface (MPI) remains the dominant choice for programming
supercomputers, with the majority of HPC applications built upon it.
Collective communication operations, such as MPI_Bcast or
MPI_Allreduce, are a fundamental building block of MPI; they implement
essential communication patterns among a group of processes. Since many
scalable applications require scalable MPI collectives, optimizing these
collectives is crucial for performance. However, optimization requires
precise measurement of the running time of collectives, which is
challenging in large distributed systems because of issues such as
unsynchronized distributed clocks.
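To illustrate why unsynchronized clocks complicate such measurements, the following minimal sketch simulates estimating the clock offset between two processes from repeated ping-pong exchanges and keeps the estimate from the exchange with the smallest round-trip time. It is a generic textbook-style technique, not necessarily the method presented in the talk; the function name `estimate_offset`, the latency values, and the jitter distribution are all assumptions chosen for illustration.

```python
import random


def estimate_offset(true_offset, n_pingpongs=50, seed=0):
    """Estimate a remote clock's offset from simulated ping-pong exchanges.

    All delays are simulated (hypothetical values): 2 microseconds of base
    latency per direction plus exponentially distributed jitter.
    Returns (offset_estimate, smallest_round_trip_time).
    """
    rng = random.Random(seed)
    best_rtt = float("inf")
    best_estimate = 0.0
    local_now = 0.0
    for _ in range(n_pingpongs):
        d_out = 2e-6 + rng.expovariate(1 / 1e-6)   # one-way delay, local to remote
        d_back = 2e-6 + rng.expovariate(1 / 1e-6)  # one-way delay, remote to local
        t_send = local_now                          # read on the local clock
        t_remote = t_send + d_out + true_offset     # read on the remote clock
        t_recv = t_send + d_out + d_back            # read on the local clock
        rtt = t_recv - t_send
        if rtt < best_rtt:
            # Assume the remote timestamp was taken halfway through the
            # round trip; the resulting error is at most rtt / 2.
            best_rtt = rtt
            best_estimate = t_remote - (t_send + rtt / 2)
        local_now = t_recv + 1e-5                   # pause before the next exchange
    return best_estimate, best_rtt


if __name__ == "__main__":
    estimate, rtt = estimate_offset(true_offset=3.5e-3)
    print(f"estimated offset: {estimate:.9f} s (smallest RTT: {rtt:.9f} s)")
```

The estimate is only as good as the assumption that the two one-way delays are equal, so its error is bounded by half the round-trip time. When that bound is comparable to the running time of the collective being measured, timestamps taken on different processes cannot be compared directly.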
This talk will explore optimizing MPI collectives by accurately
measuring their running times, developing novel collective algorithms,
and devising methods to tackle the algorithm selection problem when a
collective is invoked in an MPI application.
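The algorithm selection problem can be illustrated with a small sketch that picks a broadcast algorithm from an analytic latency-bandwidth cost model. This is one common way to frame the problem, not necessarily the approach taken in the talk. The names `bcast_costs` and `select_bcast`, the default values for `alpha` (per-message latency in seconds) and `beta` (transfer time per byte in seconds), and the two candidate algorithms are assumptions for illustration; the cost formulas are the usual simplified estimates for a binomial-tree broadcast and for a scatter followed by a ring allgather.

```python
import math


def bcast_costs(p, n_bytes, alpha, beta):
    """Estimated broadcast time in seconds for p processes and n_bytes of data.

    alpha: assumed per-message latency (s); beta: assumed time per byte (s).
    """
    log_p = math.ceil(math.log2(p))
    return {
        # Binomial tree: log2(p) rounds, each sending the full message.
        "binomial_tree": log_p * (alpha + n_bytes * beta),
        # Binomial scatter plus ring allgather: more messages, less data each.
        "scatter_allgather": (log_p + p - 1) * alpha
        + 2 * (p - 1) / p * n_bytes * beta,
    }


def select_bcast(p, n_bytes, alpha=1e-6, beta=1e-9):
    """Return the name of the algorithm with the smallest modeled cost."""
    costs = bcast_costs(p, n_bytes, alpha, beta)
    return min(costs, key=costs.get)


if __name__ == "__main__":
    for size in (8, 64 * 1024 * 1024):
        print(size, select_bcast(p=64, n_bytes=size))
```

Under these assumed parameters and 64 processes, the model selects the binomial tree for an 8-byte message and the scatter-allgather variant for a 64 MiB message. In practice such models are only approximations, so measured running times are typically needed to validate or replace them.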
en
Research Areas:
Computer Engineering and Software-Intensive Systems: 90%; Computer Science Foundations: 10%