Forschungsbereich Parallel Computing

Organization Name (de): E191-04 - Forschungsbereich Parallel Computing
Code: E191-04
Type of Organization: Research Division
Parent OrgUnit:
Active:



No. | Authors / Editors | Title | Type | Issue Date
1 | Laso Rodriguez, Ruben; Salimi Beni, Majid; Vardas, Ioannis; Benkner, Siegfried; Hunold, Sascha | To ncclsee, or Not to ncclsee: That is the Profiling Question | Inproceedings | 10-Apr-2026
2 | Träff, Jesper Larsson | Lectures on Parallel Computing | Book | 2026
3 | Salimi Beni, Majid; Laso, Ruben; Cosenza, Biagio; Benkner, Siegfried; Hunold, Sascha | Exploring NCCL Tuning Strategies for Distributed Deep Learning | Inproceedings | 13-Aug-2025
4 | Vardas, Ioannis; Träff, Jesper Larsson; Laso, Ruben; Hunold, Sascha | Mpisee: communicator-centric profiling of MPI applications | Article | 25-Jul-2025
5 | Träff, Jesper Larsson | Communication Round and Computation Efficient Exclusive Prefix-Sums Algorithms (for MPI_Exscan) | Preprint | 7-Jul-2025
6 | Salimi Beni, Majid; Laso, Ruben; Cosenza, Biagio; Benkner, Siegfried; Hunold, Sascha | Optimizing Distributed Deep Learning Training by Tuning NCCL | Inproceedings | 22-May-2025
7 | Vardas, Ioannis; Laso Rodriguez, Ruben; Salimi Beni, Majid | ncclsee: A Lightweight Profiling Tool for NCCL | Inproceedings | 22-May-2025
8 | Träff, Jesper Larsson | Optimal, Non-pipelined Reduce-scatter and Allreduce Algorithms | Preprint | 13-Feb-2025
9 | Carpentieri, Lorenzo; De Caro, Antonio; Salimibeni, Majid; Fan, Kaijie; Cosenza, Biagio | Phase-Based Frequency Scaling for Energy-Efficient Heterogeneous Computing | Inproceedings | 2025
10 | Salimibeni, Majid; Cosenza, Biagio; Hunold, Sascha | MPI Collective Algorithm Selection in the Presence of Process Arrival Patterns | Inproceedings | 7-Nov-2024
11 | Vardas, Ioannis; Hunold, Sascha; Swartvagher, Philippe; Träff, Jesper Larsson | Exploring Mapping Strategies for Co-allocated HPC Applications | Inproceedings | 4-Nov-2024
12 | Träff, Jesper Larsson | Lectures on Parallel Computing | Preprint | 26-Jul-2024
13 | Träff, Jesper Larsson | Optimal Broadcast Schedules in Logarithmic Time with Applications to Broadcast, All-Broadcast, Reduction and All-Reduction | Preprint | 26-Jul-2024
14 | Laso Rodriguez, Ruben; Krupitza, Diego; Hunold, Sascha | pSTL-Bench: A Micro-Benchmark Suite for Assessing Scalability of C++ Parallel STL Implementations | Preprint | 9-Feb-2024
15 | Salimi Beni, Majid; Hunold, Sascha; Cosenza, Biagio | Analysis and prediction of performance variability in large-scale computing systems | Article | 2024
16 | Vardas, Ioannis; Hunold, Sascha; Swartvagher, Philippe; Träff, Jesper Larsson | Improved Parallel Application Performance and Makespan by Colocation and Topology-aware Process Mapping | Inproceedings | 2024
17 | Hunold, Sascha; Xie, Biwei; Shu, Kai | Benchmarking, Measuring, and Optimizing: 15th BenchCouncil International Symposium, Bench 2023, Revised Selected Papers | Proceedings | 2024
18 | Laso Rodriguez, Ruben; Krupitza, Diego; Hunold, Sascha | Exploring Scalability in C++ Parallel STL Implementations | Inproceedings | 2024
19 | Träff, Jesper Larsson | Round-optimal 𝑛-Block Broadcast Schedules in Logarithmic Time | Preprint | 18-Dec-2023
20 | Hunold, Sascha | Unveiling the Complexities of Performance Analysis and Optimization in HPC Systems | Presentation | 8-Dec-2023
21 | Swartvagher, Philippe; Hunold, Sascha; Träff, Jesper Larsson; Vardas, Ioannis | Using Mixed-Radix Decomposition to Enumerate Computational Resources of Deeply Hierarchical Architectures | Inproceedings | 12-Nov-2023
22 | Hunold, Sascha | Verifying Performance Guidelines for MPI Collectives at Scale | Inproceedings | 12-Nov-2023
23 | Laso Rodriguez, Ruben; Casado, Fernando E. | The research career after the PhD | Presentation | 3-Nov-2023
24 | Träff, Jesper Larsson; Vardas, Ioannis | Library Development with MPI: Attributes, Request Objects, Group Communicator Creation, Local Reductions, and Datatypes | Inproceedings | 21-Sep-2023
25 | Schuchart, Joseph; Hunold, Sascha; Bosilca, George | Synchronizing MPI Processes in Space and Time | Inproceedings | Sep-2023
26 | Forsell, Martti; Roivainen, Jussi; Leppänen, Ville; Träff, Jesper Larsson | Realizing multioperations and multiprefixes in Thick Control Flow processors | Article | Apr-2023
27 | Träff, Jesper Larsson; Hunold, Sascha; Vardas, Ioannis; Funk, Nikolaus Manes | Uniform Algorithms for Reduce-scatter and (most) other Collectives for MPI | Inproceedings | 2023
28 | Hunold, Sascha; Steiner, Sebastian | OMPICollTune: Autotuning MPI Collectives by Incremental Online Learning | Inproceedings | 2023
29 | Hunold, Sascha; Vardas, Ioannis; Ibis, Gökhan; Langer, Thierry | Massively Scaling Molecular Screening Workloads on EuroHPC Supercomputers | Inproceedings | 2023
30 | Vardas, Ioannis; Hunold, Sascha; Swartvagher, Philippe; Träff, Jesper Larsson | Effects of Mapping Strategies on Average Duration and Throughput of Colocated HPC Applications | Inproceedings | 2023
31 | Forsell, Martti; Roivainen, Jussi; Leppänen, Ville; Träff, Jesper Larsson | Preliminary Performance and Memory Access Scalability Study of Thick Control Flow Processors | Inproceedings | 2023
32 | Hunold, Sascha; Hagn, Maximilian | MPI is Good, Control is Better: Checking Performance Guidelines of Collectives | Inproceedings | 2023
33 | Hunold, Sascha; Kraßnitzer, Klaus Dieter Vincenz | A Quantitative Analysis of OpenMP Task Runtime Systems | Inproceedings | 2023
34 | Swartvagher, Philippe; Vardas, Ioannis; Hunold, Sascha; Träff, Jesper Larsson | Rank Reordering within MPI Communicators to Exploit Deep Hierarchical Architectures of Supercomputers | Inproceedings | 2023
35 | Träff, Jesper Larsson | Brief Announcement: Fast(er) Construction of Round-optimal n-Block Broadcast Schedules | Inproceedings | 11-Jul-2022
36 | Hunold, Sascha; Ajanohoun, Jordy Innocentius; Vardas, Ioannis; Träff, Jesper Larsson | An Overhead Analysis of MPI Profiling and Tracing Tools | Inproceedings | 27-Jun-2022
37 | Hunold, Sascha; Przybylski, Bartłomiej | Scheduling.jl - Collaborative and Reproducible Scheduling Research with Julia | Presentation | 18-May-2022
38 | Vardas, Ioannis; Hunold, Sascha; Ajanohoun, Jordy I.; Träff, Jesper Larsson | mpisee: MPI Profiling for Communication and Communicator Structure | Inproceedings | 2022
39 | Hunold, Sascha | Performance Tuning of MPI Collectives - Status Quo and Open Problems | Presentation | 2022
40 | Vardas, Ioannis; Hunold, Sascha; Ajanohoun, Jordy I.; Träff, Jesper Larsson | mpisee: MPI Profiling for Communication and Communicator Structure | Inproceedings | 2022
41 | Ajanohoun, Jordy I.; Vardas, Ioannis; Träff, Jesper Larsson; Hunold, Sascha | MPI Performance Tools under the Microscope: A Thorough Overhead Analysis | Inproceedings | 2022
42 | Forsell, Martti; Nikula, Sara; Roivainen, Jussi; Leppänen, Ville; Träff, Jesper Larsson | Performance and programmability comparison of the thick control flow architecture and current multicore processors | Article | 2022
43 | Träff, Jesper Larsson | Fast(er) Construction of Round-optimal n-Block Broadcast Schedules | Inproceedings | 2022
44 | Träff, Jesper Larsson | (Poly)Logarithmic Time Construction of Round-optimal n-Block Broadcast Schedules for Broadcast and irregular Allgather in MPI | Preprint | 2022
45 | Hunold, Sascha; Przybylski, Bartlomiej | Teaching Complex Scheduling Algorithms | Inproceedings | 2021
46 | Träff, Jesper Larsson; Pöter, Manuel | A more pragmatic implementation of the lock-free, ordered, linked list | Inproceedings | 2021
47 | Träff, Jesper Larsson; Hunold, Sascha; Mercier, Guillaume; Holmes, Daniel J. | MPI collective communication through a single set of interfaces: A case for orthogonality | Article | 2021
48 | Hunold, Sascha; Ajanohoun, Jordy I.; Carpen-Amarie, Alexandra | MicroBench Maker: Reproduce, Reuse, Improve | Inproceedings | 2021
49 | Träff, Jesper Larsson | A Doubly-pipelined, Dual-root Reduction-to-all Algorithm and Implementation | Preprint | 2021
50 | Träff, Jesper Larsson | Decomposing MPI Collectives for Exploiting Multi-lane Communication | Presentation | 2020
51 | Kirchbach, Konrad Von; Schulz, Christian; Träff, Jesper Larsson | Better Process Mapping and Sparse Quadratic Assignment | Article | 2020
52 | Hunold, Sascha; Bhatele, Abhinav; Bosilca, George; Knees, Peter | Predicting MPI Collective Communication Performance Using Machine Learning | Inproceedings | 2020
53 | Träff, Jesper Larsson; Hunold, Sascha | Decomposing MPI Collectives for Exploiting Multi-lane Communication | Inproceedings | 2020
54 | Träff, Jesper Larsson | Signature Datatypes for Type Correct Collective Operations, Revisited | Inproceedings | 2020
55 | Faraj, Marcelo Fonseca; van der Grinten, Alexander; Meyerhenke, Henning; Träff, Jesper Larsson; Schulz, Christian | High-Quality Hierarchical Process Mapping | Inproceedings | 2020
56 | Hunold, Sascha; Przybylski, Bartlomiej | Scheduling.jl - Collaborative and Reproducible Scheduling Research with Julia | Preprint | 2020
57 | Forsell, Martti; Roivainen, Jussi; Träff, Jesper Larsson | Optimizing Memory Access in TCF Processors with Compute-Update Operations | Inproceedings | 2020
58 | von Kirchbach, Konrad; Lehr, Markus; Hunold, Sascha; Schulz, Christian; Träff, Jesper Larsson | Efficient Process-to-Node Mapping Algorithms for Stencil Computations | Inproceedings | 2020
59 | Träff, Jesper Larsson; Pöter, Manuel | A more Pragmatic Implementation of the Lock-free, Ordered, Linked List | Preprint | 2020
60 | Faraj, Marcelo Fonseca; van der Grinten, Alexander; Meyerhenke, Henning; Träff, Jesper Larsson; Schulz, Christian | High-Quality Hierarchical Process Mapping | Preprint | 2020
61 | Hunold, Sascha; von Kirchbach, Konrad; Lehr, Markus; Schulz, Christian; Träff, Jesper Larsson | Efficient Process-to-Node Mapping Algorithms for Stencil Computations | Preprint | 2020
62 | Träff, Jesper Larsson | k-ported vs. k-lane Broadcast, Scatter, and Alltoall Algorithms | Preprint | 2020
63 | Träff, Jesper Larsson; Hoefler, Torsten | Special issue: Selected papers from EuroMPI 2019 | Article | 2020
64 | Träff, Jesper Larsson | Exploiting Multi-lane Communication in MPI Collectives | Inproceedings | 2020
65 | Lehr, Markus; von Kirchbach, Konrad | Improved Cartesian Topology Mapping in MPI | Inproceedings | 2020
66 | Pachajoa, Carlos; Levonyak, Markus; Pacher, Christina; Träff, Jesper Larsson; Gansterer, Wilfried | Classical and pipelined preconditioned conjugate gradient methods with node-failure resilience | Inproceedings | 2020
67 | Träff, Jesper Larsson; Hunold, Sascha; Mercier, Guillaume; Holmes, Daniel J. | Collectives and Communicators: A Case for Orthogonality: (Or: How to get rid of MPI neighbor and enhance Cartesian collectives) | Inproceedings | 2020
68 | Träff, Jesper Larsson | On Optimal Trees for Irregular Gather and Scatter Collectives | Article | 1-Sep-2019
69 | Kang, Qiao; Träff, Jesper Larsson; Al-Bahrani, Reda; Agrawal, Ankit; Choudhary, Alok; Liao, Wei-keng | Scalable Algorithms for MPI Intergroup Allgather and Allgatherv | Article | 2019
70 | Träff, Jesper Larsson | On Optimal Trees for Irregular Gather and Scatter Collectives | Presentation | 2019
71 | Träff, Jesper Larsson | On optimal Trees for irregular gather and scatter collectives? | Presentation | 2019
72 | Träff, Jesper Larsson | Cartesian Collective Communication: "Advice to users", "Advice to implementers", and "Advice to Standardizers" | Presentation | 2019
73 | Träff, Jesper Larsson | On optimal Trees for irregular gather and scatter collectives? | Presentation | 2019
74 | Kainrad, Thomas; Hunold, Sascha; Seidel, Thomas; Langer, Thierry | LigandScout Remote: A New User-Friendly Interface for HPC and Cloud Resources | Article | 2019
75 | Hoefler, Torsten; Träff, Jesper Larsson | Proceedings of the 26th European MPI Users' Group Meeting, EuroMPI 2019 | Proceedings | 2019
76 | Träff, Jesper Larsson; Hunold, Sascha | Cartesian Collective Communication | Inproceedings | 2019
77 | Hunold, Sascha; Carpen-Amarie, Alexandra | On the Importance of Data Quality when Tuning MPI Libraries | Inproceedings | 2019
78 | Kainer, Michael; Träff, Jesper Larsson | More Parallelism in Dijkstra's Single-Source Shortest Path Algorithm | Preprint | 2019
79 | Träff, Jesper Larsson | Decomposing Collectives for Exploiting Multi-lane Communication | Preprint | 2019
80 | Pachajoa, Carlos; Levonyak, Markus; Gansterer, Wilfried N.; Träff, Jesper Larsson | How to Make the Preconditioned Conjugate Gradient Method Resilient Against Multiple Node Failures | Inproceedings | 2019
81 | Pachajoa, Carlos; Levonyak, Markus; Gansterer, Wilfried; Träff, Jesper Larsson | How to Make the Preconditioned Conjugate Gradient Method Resilient Against Multiple Node Failures | Preprint | 2019
82 | Träff, Jesper Larsson; Hoefler, Torsten | Foreword EuroMPI 2019 | Inproceedings | 2019
83 | Forsell, Martti; Roivainen, Jussi; Leppänen, Ville; Träff, Jesper Larsson | Supporting concurrent memory access in TCF processor architectures | Article | 2018
84 | Träff, Jesper Larsson | Practical, distributed, low overhead algorithms for irregular gather and scatter collectives | Article | 2018
85 | Pöter, Manuel; Träff, Jesper Larsson | Stamp-it, amortized constant-time memory reclamation in comparison to five other schemes | Inproceedings | 2018
86 | Pöter, Manuel; Träff, Jesper Larsson | Stamp-it: A more Thread-efficient, Concurrent Memory Reclamation Scheme in the C++ Memory Model | Preprint | 2018
87 | Träff, Jesper Larsson | Parallel Quicksort without Pairwise Element Exchange | Preprint | 2018
88 | Pöter, Manuel; Träff, Jesper Larsson | Memory Models for C/C++ Programmers | Preprint | 2018
89 | Hunold, Sascha; Carpen-Amarie, Alexandra | Hierarchical Clock Synchronization in MPI | Inproceedings | 2018
90 | Kang, Qiao; Träff, Jesper Larsson; Al-Bahrani, Reda; Agrawal, Ankit; Choudhary, Alok; Liao, Wei-keng | Full-Duplex Inter-Group All-to-All Broadcast Algorithms with Optimal Bandwidth | Inproceedings | 2018
91 | Hunold, Sascha; Carpen-Amarie, Alexandra | Algorithm Selection of MPI Collectives Using Machine Learning Techniques | Inproceedings | 2018
92 | Hunold, Sascha; Carpen-Amarie, Alexandra | Autotuning MPI Collectives using Performance Guidelines | Inproceedings | 2018
93 | Forsell, Martti; Roivainen, Jussi; Leppänen, Ville; Träff, Jesper Larsson | Implementation of Multioperations in Thick Control Flow Processors | Inproceedings | 2018
94 | Träff, Jesper Larsson | On Optimal trees for Irregular Gather and Scatter Collectives | Presentation | 2018
95 | Pöter, Manuel; Träff, Jesper Larsson | Brief Announcement: Stamp-it, a more Thread-efficient, Concurrent Memory Reclamation Scheme in the C++ Memory Model | Inproceedings | 2018
96 | Carpen-Amarie, Alexandra; Hunold, Sascha; Träff, Jesper Larsson | On expected and observed communication performance with MPI derived datatypes | Article | 2017
97 | Träff, Jesper Larsson | Fast Processing of MPI Derived Datatypes? | Presentation | 2017
98 | Bleuse, Raphael; Hunold, Sascha; Kedad-Sidhoum, Safia; Monna, Florence; Mounie, Gregory; Trystram, Denis | Scheduling Independent Moldable Tasks on Multi-Cores with GPUs | Article | 2017
99 | Lusk, Ewing; Träff, Jesper Larsson | MPI Is 25 Years Old! | Article | 2017
100 | Träff, Jesper Larsson | High Performance Expectations for MPI | Inproceedings | 2017