<div class="csl-bib-body">
<div class="csl-entry">Laso Rodriguez, R., Krupitza, D., & Hunold, S. (2024). Exploring Scalability in C++ Parallel STL Implementations. In <i>ICPP ’24: Proceedings of the 53rd International Conference on Parallel Processing</i> (pp. 284–293). ACM. https://doi.org/10.1145/3673038.3673065</div>
</div>
-
dc.identifier.uri
http://hdl.handle.net/20.500.12708/204481
-
dc.description.abstract
Since the advent of parallel algorithms in the C++17 Standard Template Library (STL), the STL has become a viable framework for creating performance-portable applications. Given multiple existing implementations of the parallel algorithms, a systematic, quantitative performance comparison is essential for choosing the appropriate implementation for a particular hardware configuration. In this work, we introduce a specialized set of micro-benchmarks to assess the scalability of the parallel algorithms in the STL. By selecting different backends, our micro-benchmarks can be used on multi-core systems and GPUs. In a case study using the suite on AMD and Intel CPUs and NVIDIA GPUs, we identified substantial performance disparities among different implementations, including GCC+TBB, GCC+HPX, Intel's compiler with TBB, and NVIDIA's compiler with OpenMP and CUDA.
en
dc.description.sponsorship
FWF - Österr. Wissenschaftsfonds
-
dc.language.iso
en
-
dc.rights.uri
http://creativecommons.org/licenses/by-nc-sa/4.0/
-
dc.subject
C++
en
dc.subject
CUDA
en
dc.subject
OpenMP
en
dc.subject
Performance Portability
en
dc.subject
Standard Template Library
en
dc.subject
Threading Building Blocks
en
dc.title
Exploring Scalability in C++ Parallel STL Implementations
en
dc.type
Inproceedings
en
dc.type
Conference paper
en
dc.rights.license
Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International