Efficient process mapping for cartesian topologies

Lehr, Markus

doi:10.34726/hss.2019.65323

Datensatz Zitierlink:

https://doi.org/10.34726/hss.2019.65323
http://hdl.handle.net/20.500.12708/1480

Titel:

Efficient process mapping for cartesian topologies

Zitat:

Lehr, M. (2019). Efficient process mapping for cartesian topologies [Diploma Thesis, Technische Universität Wien]. reposiTUm. https://doi.org/10.34726/hss.2019.65323

reposiTUm-DOI:

10.34726/hss.2019.65323

CatalogPlus:

AC15563558

Publikationstyp:

Hochschulschrift - Diplomarbeit

Sprache:

Englisch

Autor_innen:

Lehr, Markus

Betreuer_in:

Träff, Jesper Larsson

Organisationseinheit:

E191 - Institut für Computer Engineering

Datum (veröffentlicht):

2019

Umfang:

Keywords:

Parallel, distributed memory computing; High-Performance computing; Process mapping; Communication patterns; Grids; Stencils; MPI

Abstract:

In this thesis we introduce different algorithms for mapping physical processes to Cartesian grids. We assume that processes within this grid communicate with certain neighboring processes as defined by a given stencil. We show that this mapping problem is already NP-complete for two dimensional grids and a very simple isomorphic neighborhood. We compare the current state of solutions in the field of High Performance Computing, specifically the Message Passing Interface (MPI). With his algorithm from 2018, W. D. Gropp showed promising performance, which is why we compare our approaches to his work, as well as MPI's standard behaviour. For qualitatively comparing concrete mappings, we define fitting optimality criteria, based on the core concept of minimizing inter-computation-node communication. We benchmarked using MPI_Irecv and MPI_Isend and show that our algorithms can find mappings, where Gropp cannot and that we could match or improve upon his resulting mappings in terms of quality and runtime. The first algorithm, which we present is similar to other graph partitioning approaches, since it utilizes recursive splitting in order to guarantee a logarithmic runtime wrt. the number of vertices in the Cartesian grid. We also guarantee a quality bound for this algorithm, which becomes better with increasing number of dimensions. In another implementation-variant, we adapted this algorithm for accommodating differently shaped stencils by weighting dimensions before the recursive splitting depending on how much communication happens across them. The third approach attempts to find hyperrectangular strips within the Cartesian grid, which are then filled similar to MPI's default row-major rank assignment. Although the theoretical bound of this approach is not as good as the first, this assignment strategy yielded the most compact mappings most of the time. Its main shortcoming, however, is its inability to adapt these strips, depending on different stencils.

Lizenz:

Urheberrechtsschutz

Enthalten in den Sammlungen:

Thesis