Journal Articles

Vardas, I., Träff, J. L., Laso, R., & Hunold, S. (2025). mpisee: Communicator-centric profiling of MPI applications. Concurrency and Computation: Practice and Experience, 37(15–17), Article e70158. https://doi.org/10.1002/cpe.70158 ( reposiTUm)
Forsell, M., Roivainen, J., Leppänen, V., & Träff, J. L. (2023). Realizing multioperations and multiprefixes in Thick Control Flow processors. Microprocessors and Microsystems, 98, Article 104807. https://doi.org/10.1016/j.micpro.2023.104807 ( reposiTUm)
Forsell, M., Nikula, S., Roivainen, J., Leppänen, V., & Träff, J. L. (2022). Performance and programmability comparison of the thick control flow architecture and current multicore processors. The Journal of Supercomputing, 78(3), 3152–3183. https://doi.org/10.1007/s11227-021-03985-0 ( reposiTUm)
Träff, J. L., Hunold, S., Mercier, G., & Holmes, D. J. (2021). MPI collective communication through a single set of interfaces: A case for orthogonality. Parallel Computing: Systems & Applications, 107, Article 102826. https://doi.org/10.1016/j.parco.2021.102826 ( reposiTUm)
von Kirchbach, K., Schulz, C., & Träff, J. L. (2020). Better Process Mapping and Sparse Quadratic Assignment. ACM Journal on Experimental Algorithmics, 25, 1–19. https://doi.org/10.1145/3409667 ( reposiTUm)
Träff, J. L., & Hoefler, T. (2020). Special issue: Selected papers from EuroMPI 2019. Parallel Computing, 99, Article 102695. https://doi.org/10.1016/j.parco.2020.102695 ( reposiTUm)
Kang, Q., Träff, J. L., Al-Bahrani, R., Agrawal, A., Choudhary, A., & Liao, W. (2019). Scalable Algorithms for MPI Intergroup Allgather and Allgatherv. Parallel Computing: Systems & Applications, 85, 220–230. https://doi.org/10.1016/j.parco.2019.04.015 ( reposiTUm)
Träff, J. L. (2019). On Optimal Trees for Irregular Gather and Scatter Collectives. IEEE Transactions on Parallel and Distributed Systems, 30(9), 2060–2074. https://doi.org/10.1109/tpds.2019.2899843 ( reposiTUm)
Forsell, M., Roivainen, J., Leppänen, V., & Träff, J. L. (2018). Supporting concurrent memory access in TCF processor architectures. Microprocessors and Microsystems, 63, 226–236. https://doi.org/10.1016/j.micpro.2018.09.013 ( reposiTUm)
Träff, J. L. (2018). Practical, distributed, low overhead algorithms for irregular gather and scatter collectives. Parallel Computing: Systems & Applications, 75, 100–117. https://doi.org/10.1016/j.parco.2018.04.003 ( reposiTUm)
Carpen-Amarie, A., Hunold, S., & Träff, J. L. (2017). On expected and observed communication performance with MPI derived datatypes. Parallel Computing: Systems & Applications, 69, 98–117. https://doi.org/10.1016/j.parco.2017.08.006 ( reposiTUm)
Lusk, E., & Träff, J. L. (2017). MPI Is 25 Years Old! HPCwire, May 1. http://hdl.handle.net/20.500.12708/146783 ( reposiTUm)
Lengauer, C., Bougé, L., & Träff, J. L. (2016). Editorial: Special Issue: Euro-Par 2015. Concurrency and Computation: Practice and Experience, 28(12), 3445–3446. http://hdl.handle.net/20.500.12708/148865 ( reposiTUm)
Träff, J. L. (2016). Viewpoint: (Mis)Managing Parallel Computing Research through EU Project Funding. Communications of the ACM, 59(12), 46–48. https://doi.org/10.1145/2948893 ( reposiTUm)
Siebert, C., & Träff, J. L. (2014). Perfectly Load-Balanced, Stable, Synchronization-Free Parallel Merge. Parallel Processing Letters, 24(01), 1450005. https://doi.org/10.1142/s0129626414500054 ( reposiTUm)
Träff, J. L., & Benkner, S. (2014). Preface: Selected Papers from EuroMPI 2012. Computing, 96(4), 259–261. https://doi.org/10.1007/s00607-013-0335-z ( reposiTUm)
Träff, J. L. (2012). Alternative, uniformly expressive and more scalable interfaces for collective communication in MPI. Parallel Computing: Systems & Applications, 38(1–2), 26–36. https://doi.org/10.1016/j.parco.2011.10.009 ( reposiTUm)
Benkner, S., Pllana, S., Träff, J. L., Tsigas, P., Dolinsky, U., Augonnet, C., Bachmayer, B., Kessler, C., Moloney, D., & Osipov, V. (2011). PEPPHER: Efficient and Productive Usage of Hybrid Computing Systems. IEEE Micro, 31(5), 28–41. https://doi.org/10.1109/mm.2011.67 ( reposiTUm)

Contributions to Conference Proceedings

Vardas, I., Hunold, S., Swartvagher, P., & Träff, J. L. (2024). Improved Parallel Application Performance and Makespan by Colocation and Topology-aware Process Mapping. In 2024 IEEE 24th International Symposium on Cluster, Cloud and Internet Computing (CCGrid) (pp. 119–124). IEEE. https://doi.org/10.1109/CCGrid59990.2024.00023 ( reposiTUm)
Vardas, I., Hunold, S., Swartvagher, P., & Träff, J. L. (2024). Exploring Mapping Strategies for Co-allocated HPC Applications. In Demetris Zeinalipour, D. Blanco Heras, G. Pallis, H. Herodotou, D. Trihinas, D. Balouek, P. Diehl, T. Cojean, K. Fürlinger, M. H. Kirkeby, M. Nardelli, & P. Di Sanzo (Eds.), Euro-Par 2023: Parallel Processing Workshops : Euro-Par 2023 International Workshops, Limassol, Cyprus, August 28 – September 1, 2023, Revised Selected Papers, Part II (pp. 271–276). Springer Nature. https://doi.org/10.1007/978-3-031-48803-0_41 ( reposiTUm)
Swartvagher, P., Hunold, S., Träff, J. L., & Vardas, I. (2023). Using Mixed-Radix Decomposition to Enumerate Computational Resources of Deeply Hierarchical Architectures. In Proceedings of 2023 SC23 Workshops of the International Conference on High Performance Computing, Network, Storage, and Analysis (SC 2023 Workshops) (pp. 405–415). ACM. https://doi.org/10.1145/3624062.3624109 ( reposiTUm)
Träff, J. L., & Vardas, I. (2023). Library Development with MPI: Attributes, Request Objects, Group Communicator Creation, Local Reductions, and Datatypes. In Proceedings of the 30th European MPI Users’ Group Meeting (EUROMPI 23). 30th European MPI Users’ Group Meeting (EuroMPI 2023), Bristol, United Kingdom. ACM. https://doi.org/10.1145/3615318.3615323 ( reposiTUm)
Träff, J. L., Hunold, S., Vardas, I., & Funk, N. M. (2023). Uniform Algorithms for Reduce-scatter and (most) other Collectives for MPI. In 2023 IEEE International Conference on Cluster Computing (CLUSTER) (pp. 284–294). IEEE. https://doi.org/10.1109/CLUSTER52292.2023.00031 ( reposiTUm)
Swartvagher, P., Vardas, I., Hunold, S., & Träff, J. L. (2023). Rank Reordering within MPI Communicators to Exploit Deep Hierarchal Architectures of Supercomputers. In E. Reiter (Ed.), Austrian-Slovenian HPC Meeting 2023 - ASHPC23 (pp. 61–61). EuroCC Austria. https://doi.org/10.34726/5368 ( reposiTUm)
Vardas, I., Hunold, S., Swartvagher, P., & Träff, J. L. (2023). Effects of Mapping Strategies on Average Duration and Throughput of Colocated HPC Applications. In E. Reiter (Ed.), Austrian-Slovenian HPC Meeting 2023 - ASHPC23 (pp. 10–10). EuroCC Austria. https://doi.org/10.34726/5330 ( reposiTUm)
Forsell, M., Roivainen, J., Leppänen, V., & Träff, J. L. (2023). Preliminary Performance and Memory Access Scalability Study of Thick Control Flow Processors. In J. Nurmi, M. Shen, P. Ellervee, P. Koch, & F. Moradi (Eds.), Proceedings 2023 IEEE Nordic Circuits and Systems Conference (NorCAS) (pp. 1–7). IEEE. https://doi.org/10.1109/NorCAS58970.2023.10305463 ( reposiTUm)
Vardas, I., Hunold, S., Ajanohoun, J. I., & Träff, J. L. (2022). mpisee: MPI Profiling for Communication and Communicator Structure. In 2022 IEEE 36th International Parallel and Distributed Processing Symposium Workshops (IPDPSW 2022) (pp. 520–529). IEEE. https://doi.org/10.1109/IPDPSW55747.2022.00092 ( reposiTUm)
Vardas, I., Hunold, S., Ajanohoun, J. I., & Träff, J. L. (2022). mpisee: MPI Profiling for Communication and Communicator Structure. In E. Reiter (Ed.), Austrian-Slovenian HPC Meeting 2022 - ASHPC22 (p. 15). EuroCC Austria. http://hdl.handle.net/20.500.12708/55696 ( reposiTUm)
Ajanohoun, J. I., Vardas, I., Träff, J. L., & Hunold, S. (2022). MPI Performance Tools under the Microscope: A Thorough Overhead Analysis. In E. Reiter (Ed.), Austrian-Slovenian HPC Meeting 2022 - ASHPC22 (p. 16). EuroCC Austria. http://hdl.handle.net/20.500.12708/55697 ( reposiTUm)
Träff, J. L. (2022). Fast(er) Construction of Round-optimal n-Block Broadcast Schedules. In Proceedings IEEE International Conference on Cluster Computing (CLUSTER 2022) (pp. 142–151). IEEE. https://doi.org/10.1109/CLUSTER51413.2022.00028 ( reposiTUm)
Träff, J. L. (2022). Brief Announcement: Fast(er) Construction of Round-optimal n-Block Broadcast Schedules. In K. Agrawal & I.-T. A. Lee (Eds.), Proceedings of the 34th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA 2022) (pp. 143–146). ACM. https://doi.org/10.1145/3490148.3538560 ( reposiTUm)
Hunold, S., Ajanohoun, J. I., Vardas, I., & Träff, J. L. (2022). An Overhead Analysis of MPI Profiling and Tracing Tools. In C. Scully-Allison, R. Liem, & A. V. Solorzano (Eds.), PERMAVOST 2022: Proceedings of the 2nd Workshop on Performance Engineering, Modelling, Analysis, and Visualization Strategy (pp. 5–13). Association for Computing Machinery (ACM). https://doi.org/10.1145/3526063.3535353 ( reposiTUm)
Träff, J. L., & Pöter, M. (2021). A more pragmatic implementation of the lock-free, ordered, linked list. In J. Lee & E. Petrank (Eds.), Proceedings of the 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. ACM. https://doi.org/10.1145/3437801.3441579 ( reposiTUm)
Träff, J. L. (2020). Exploiting Multi-lane Communication in MPI Collectives. In A. Schlögl, J. Kiss, & S. Elefante (Eds.), Austrian High-Performance-Computing Meeting (AHPC 2020) (p. 30). IST Austria. https://doi.org/10.15479/AT:ISTA:7474 ( reposiTUm)
Träff, J. L., & Hunold, S. (2020). Decomposing MPI Collectives for Exploiting Multi-lane Communication. In 2020 IEEE International Conference on Cluster Computing (CLUSTER). IEEE International Conference on Cluster Computing (IEEE Cluster 2020) - Online Conference, Kobe, Japan. IEEE. https://doi.org/10.1109/cluster49012.2020.00037 ( reposiTUm)
Träff, J. L., Hunold, S., Mercier, G., & Holmes, D. J. (2020). Collectives and Communicators: A Case for Orthogonality. In 27th European MPI Users’ Group Meeting. 27th European MPI Users’ Group Meeting (EuroMPI/USA 2020) - Online Conference, Austin, United States of America. IEEE. https://doi.org/10.1145/3416315.3416319 ( reposiTUm)
Träff, J. L. (2020). Signature Datatypes for Type Correct Collective Operations, Revisited. In 27th European MPI Users’ Group Meeting. 27th European MPI Users’ Group Meeting (EuroMPI/USA 2020) - Online Conference, Austin, United States of America. IEEE. https://doi.org/10.1145/3416315.3416324 ( reposiTUm)
Faraj, M. F., van der Grinten, A., Meyerhenke, H., Träff, J. L., & Schulz, C. (2020). High-Quality Hierarchical Process Mapping. In S. Faro & D. Cantone (Eds.), 18th International Symposium on Experimental Algorithms, SEA 2020 (pp. 4:1-4:15). Schloss Dagstuhl - Leibniz-Zentrum für Informatik. https://doi.org/10.4230/LIPIcs.SEA.2020.4 ( reposiTUm)
Forsell, M., Roivainen, J., & Träff, J. L. (2020). Optimizing Memory Access in TCF Processors with Compute-Update Operations. In 2020 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW). 22nd Workshop on Advances in Parallel and Distributed Computational Models (APDCM 2020) in conjunction with IPDPS 2020 - Online Conference, New Orleans, United States of America. IEEE. https://doi.org/10.1109/ipdpsw50202.2020.00100 ( reposiTUm)
von Kirchbach, K., Lehr, M., Hunold, S., Schulz, C., & Träff, J. L. (2020). Efficient Process-to-Node Mapping Algorithms for Stencil Computations. In 2020 IEEE International Conference on Cluster Computing (CLUSTER). IEEE International Conference on Cluster Computing (IEEE Cluster 2020) - Online Conference, Kobe, Japan. IEEE. https://doi.org/10.1109/cluster49012.2020.00011 ( reposiTUm)
Pachajoa, C., Levonyak, M., Pacher, C., Träff, J. L., & Gansterer, W. (2020). Classical and pipelined preconditioned conjugate gradient methods with node-failure resilience. In A. Schlögl, J. Kiss, & S. Elefante (Eds.), Austrian High-Performance-Computing Meeting (AHPC 2020) (p. 13). IST Austria. https://doi.org/10.15479/AT:ISTA:7474 ( reposiTUm)
Träff, J. L., & Hoefler, T. (2019). Foreword EuroMPI 2019. In T. Hoefler & J. L. Träff (Eds.), Proceedings of the 26th European MPI Users’ Group Meeting - EuroMPI ’19. ACM. https://doi.org/10.1145/3343211.3343212 ( reposiTUm)
Träff, J. L., & Hunold, S. (2019). Cartesian Collective Communication. In Proceedings of the 48th International Conference on Parallel Processing. 48th International Conference on Parallel Processing (ICPP 2019), Kyoto, Japan. ACM. https://doi.org/10.1145/3337821.3337848 ( reposiTUm)
Pachajoa, C., Levonyak, M., Gansterer, W. N., & Träff, J. L. (2019). How to Make the Preconditioned Conjugate Gradient Method Resilient Against Multiple Node Failures. In Proceedings of the 48th International Conference on Parallel Processing. 48th International Conference on Parallel Processing (ICPP 2019), Kyoto, Japan. ACM. https://doi.org/10.1145/3337821.3337849 ( reposiTUm)
Pöter, M., & Träff, J. L. (2018). Brief Announcement. In Proceedings of the 30th Symposium on Parallelism in Algorithms and Architectures. 30th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA 2018), Vienna, Austria. ACM. https://doi.org/10.1145/3210377.3210661 ( reposiTUm)
Pöter, M., & Träff, J. L. (2018). Stamp-it, amortized constant-time memory reclamation in comparison to five other schemes. In Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. 23rd Symposium on Principles and Practice of Parallel Programming (PPoPP 2018), Vienna, Austria. ACM. https://doi.org/10.1145/3178487.3178532 ( reposiTUm)
Kang, Q., Träff, J. L., Al-Bahrani, R., Agrawal, A., Choudhary, A., & Liao, W. (2018). Full-Duplex Inter-Group All-to-All Broadcast Algorithms with Optimal Bandwidth. In Proceedings of the 25th European MPI Users’ Group Meeting. 25th European MPI Users’ Group Meeting (EuroMPI 2018), Barcelona, Spain. ACM. https://doi.org/10.1145/3236367.3236374 ( reposiTUm)
Forsell, M., Roivainen, J., Leppänen, V., & Träff, J. L. (2018). Implementation of Multioperations in Thick Control Flow Processors. In 2018 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW). 20th Workshop on Advances in Parallel and Distributed Computational Models (APDCM 2018) in conjunction with IPDPS 2018, Vancouver, Canada. IEEE. https://doi.org/10.1109/ipdpsw.2018.00121 ( reposiTUm)
Träff, J. L. (2017). High Performance Expectations for MPI. In G. Baumgartner & J. Courian (Eds.), AHPC 2017, Austrian HPC Meeting 2017 (p. 33). FSP Scientific Computing, University of Innsbruck. http://hdl.handle.net/20.500.12708/56920 ( reposiTUm)
Forsell, M., Roivainen, J., Leppänen, V., & Träff, J. L. (2017). Supporting concurrent memory access in TCF-aware processor architectures. In J. Nurmi, M. Vesterbacka, J. J. Wikner, A. Alvandpour, M. Nielsen-Lönn, & I. R. Nielsen (Eds.), 2017 IEEE Nordic Circuits and Systems Conference (NORCAS): NORCHIP and International Symposium of System-on-Chip (SoC). IEEE. https://doi.org/10.1109/norchip.2017.8124962 ( reposiTUm)
Schulz, C., & Träff, J. L. (2017). Better Process Mapping and Sparse Quadratic Assignment. In C. S. Iliopoulos, S. P. Pissis, S. J. Puglisi, & R. Raman (Eds.), 16th International Symposium on Experimental Algorithms, SEA 2017 (pp. 4:1-4:15). Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH. https://doi.org/10.4230/LIPIcs.SEA.2017.4 ( reposiTUm)
Träff, J. L. (2017). Practical, linear-time, fully distributed algorithms for irregular gather and scatter. In Proceedings of the 24th European MPI Users’ Group Meeting - EuroMPI ’17. 24th European MPI Users’ Group Meeting (EuroMPI/USA 2017), Chicago, IL, United States of America. ACM. https://doi.org/10.1145/3127024.3127025 ( reposiTUm)
Mirsadeghi, S. H., Träff, J. L., Balaji, P., & Afsahi, A. (2017). Exploiting Common Neighborhoods to Optimize MPI Neighborhood Collectives. In 2017 IEEE 24th International Conference on High Performance Computing (HiPC). 24th IEEE International Conference on High Performance Computing (HiPC 2017), Jaipur, India. IEEE. https://doi.org/10.1109/hipc.2017.00047 ( reposiTUm)
Markidis, S., Peng, I. B., Larsson Träff, J., Rougier, A., Bartsch, V., Machado, R., Rahn, M., Hart, A., Holmes, D., Bull, M., & Laure, E. (2016). The EPiGRAM Project: Preparing Parallel Programming Models for Exascale. In M. Taufer, B. Mohr, & J. M. Kunkel (Eds.), High Performance Computing : ISC High Performance 2016 International Workshops, ExaComm, E-MuCoCoS, HPC-IODC, IXPUG, IWOPH, P^3MA, VHPC, WOPSSS, Frankfurt, Germany, June 19–23, 2016, Revised Selected Papers (pp. 56–68). Springer International Publishing. https://doi.org/10.1007/978-3-319-46079-6_5 ( reposiTUm)
Hunold, S., Carpen-Amarie, A., & Träff, J. L. (2016). The art of benchmarking MPI libraries. In I. Reichl, C. Blaas-Schenner, & J. Zabloudil (Eds.), Austrian HPC Meeting 2016 - AHPC 2016 (p. 45). Vienna Scientific Cluster (VSC). http://hdl.handle.net/20.500.12708/56921 ( reposiTUm)
Hunold, S., Carpen-Amarie, A., Lübbe, F. D., & Träff, J. L. (2016). Automatic Verification of Self-consistent MPI Performance Guidelines. In P.-F. Dutot & D. Trystram (Eds.), Euro-Par 2016: Parallel Processing (pp. 433–446). Springer International Publishing. https://doi.org/10.1007/978-3-319-43659-3_32 ( reposiTUm)
Carpen-Amarie, A., Hunold, S., & Träff, J. L. (2016). On the Expected and Observed Communication Performance with MPI Derived Datatypes. In D. Holmes, A. Collis, J. L. Träff, & L. Smith (Eds.), Proceedings of the 23rd European MPI Users’ Group Meeting. ACM. https://doi.org/10.1145/2966884.2966905 ( reposiTUm)
Träff, J. L. (2016). A Library for Advanced Datatype Programming. In D. Holmes, A. Collis, J. L. Träff, & L. Smith (Eds.), Proceedings of the 23rd European MPI Users’ Group Meeting. ACM. https://doi.org/10.1145/2966884.2966904 ( reposiTUm)
Ganian, R., Kalany, M., Szeider, S., & Träff, J. L. (2016). Polynomial-Time Construction of Optimal MPI Derived Datatype Trees. In 2016 IEEE International Parallel and Distributed Processing Symposium (IPDPS). IEEE 30th International Parallel and Distributed Processing Symposium (IPDPS 2016), Chicago, United States of America. IEEE Computer Society. https://doi.org/10.1109/ipdps.2016.13 ( reposiTUm)
Gruber, J., Träff, J. L., & Wimmer, M. (2016). Brief Announcement: Benchmarking Concurrent Priority Queues. In SPAA ’16: Proceedings of the 28th ACM Symposium on Parallelism in Algorithms and Architectures (pp. 361–362). ACM. https://doi.org/10.1145/2935764.2935803 ( reposiTUm)
Kalany, M., & Träff, J. L. (2015). Efficient, Optimal MPI Datatype Reconstruction for Vector and Index Types. In J. Dongarra, A. Denis, B. Goglin, E. Jeannot, & G. Mercier (Eds.), Proceedings of the 22nd European MPI Users’ Group Meeting. ACM. https://doi.org/10.1145/2802658.2802671 ( reposiTUm)
Träff, J. L., Lübbe, F. D., Rougier, A., & Hunold, S. (2015). Isomorphic, Sparse MPI-like Collective Communication Operations for Parallel Stencil Computations. In J. Dongarra, A. Denis, B. Goglin, E. Jeannot, & G. Mercier (Eds.), Proceedings of the 22nd European MPI Users’ Group Meeting. ACM. https://doi.org/10.1145/2802658.2802663 ( reposiTUm)
Träff, J. L., & Lübbe, F. D. (2015). Specification Guideline Violations by MPI_Dims_create. In J. Dongarra, A. Denis, B. Goglin, E. Jeannot, & G. Mercier (Eds.), Proceedings of the 22nd European MPI Users’ Group Meeting. ACM. https://doi.org/10.1145/2802658.2802677 ( reposiTUm)
Wimmer, M., Gruber, J., Träff, J. L., & Tsigas, P. (2015). The lock-free k-LSM relaxed priority queue. In A. Cohen & D. Grove (Eds.), Proceedings of the 20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. ACM. https://doi.org/10.1145/2688500.2688547 ( reposiTUm)
Träff, J. L., Rougier, A., & Hunold, S. (2014). Implementing a classic. In M. Gerndt, P. Stenström, L. Rauchwerger, B. Miller, & M. Schulz (Eds.), Proceedings of the 28th ACM international conference on Supercomputing - ICS ’14. ACM. https://doi.org/10.1145/2597652.2597662 ( reposiTUm)
Wimmer, M., Versaci, F., Träff, J. L., Cederman, D., & Tsigas, P. (2014). Data structures for task-based priority scheduling. In Proceedings of the 19th ACM SIGPLAN symposium on Principles and practice of parallel programming - PPoPP ’14. 19th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP 2014, Orlando, United States of America. ACM. https://doi.org/10.1145/2555243.2555278 ( reposiTUm)
Träff, J. L., & Rougier, A. (2014). MPI Collectives and Datatypes for Hierarchical All-to-all Communication. In J. Dongarra, Y. Ishikawa, & A. Hori (Eds.), Proceedings of the 21st European MPI Users’ Group Meeting. ACM. https://doi.org/10.1145/2642769.2642770 ( reposiTUm)
Träff, J. L. (2014). Optimal MPI Datatype Normalization for Vector and Index-block Types. In J. Dongarra, Y. Ishikawa, & A. Hori (Eds.), Proceedings of the 21st European MPI Users’ Group Meeting. ACM. https://doi.org/10.1145/2642769.2642771 ( reposiTUm)
Träff, J. L., & Rougier, A. (2014). Zero-copy, Hierarchical Gather is not possible with MPI Datatypes and Collectives. In J. Dongarra, Y. Ishikawa, & A. Hori (Eds.), Proceedings of the 21st European MPI Users’ Group Meeting. ACM. https://doi.org/10.1145/2642769.2642772 ( reposiTUm)
Hunold, S., Carpen-Amarie, A., & Träff, J. L. (2014). Reproducible MPI Micro-Benchmarking Isn’t As Easy As You Think. In J. Dongarra, Y. Ishikawa, & A. Hori (Eds.), Proceedings of the 21st European MPI Users’ Group Meeting. ACM. https://doi.org/10.1145/2642769.2642785 ( reposiTUm)
Wimmer, M., Cederman, D., Träff, J. L., & Tsigas, P. (2013). Work-stealing with configurable scheduling strategies. In Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming - PPoPP ’13. ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP 2013, Shenzhen, China. ACM. https://doi.org/10.1145/2442516.2442562 ( reposiTUm)
Wimmer, M., Pöter, M., & Träff, J. L. (2013). The Pheet Task-Scheduling Framework on the Intel® Xeon Phi Coprocessor and other Multicore Architectures. In 2013 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum. Workshop on Multithreaded Architectures and Applications (MTAAP 2013) in conjunction with IPDPS 2013, Boston, United States of America. IEEE Computer Society. https://doi.org/10.1109/ipdpsw.2013.22 ( reposiTUm)
Kessler, C., Dastgeer, U., Majeed, M., Furmento, N., Thibault, S., Namyst, R., Benkner, S., Pllana, S., Träff, J. L., & Wimmer, M. (2012). Abstract: Leveraging PEPPHER Technology for Performance Portable Supercomputing. In 2012 SC Companion: High Performance Computing, Networking Storage and Analysis. IEEE Computer Society. https://doi.org/10.1109/sc.companion.2012.212 ( reposiTUm)
Träff, J. L. (2012). mpicroscope: Towards an MPI Benchmark Tool for Performance Guideline Verification. In J. L. Träff, S. Benkner, & J. Dongarra (Eds.), Recent Advances in the Message Passing Interface Proceedings of the 19th European MPI Users’ Group Meeting, EuroMPI 2012 (pp. 100–109). Springer. http://hdl.handle.net/20.500.12708/54224 ( reposiTUm)
Siebert, C., & Träff, J. L. (2012). Efficient MPI Implementation of a Parallel, Stable Merge Algorithm. In J. L. Träff, S. Benkner, & J. Dongarra (Eds.), Recent Advances in the Message Passing Interface Proceedings of the 19th European MPI Users’ Group Meeting, EuroMPI 2012 (pp. 204–213). Springer. http://hdl.handle.net/20.500.12708/54228 ( reposiTUm)
Kessler, C., Dastgeer, U., Majeed, M., Furmento, N., Thibault, S., Namyst, R., Benkner, S., Pllana, S., Träff, J. L., & Wimmer, M. (2012). Poster: Leveraging PEPPHER Technology for Performance Portable Supercomputing. In 2012 SC Companion: High Performance Computing, Networking Storage and Analysis. Supercomputing 2012 Conference, Salt Lake City, United States of America. IEEE Computer Society. https://doi.org/10.1109/sc.companion.2012.213 ( reposiTUm)
Kessler, C., Dastgeer, U., Thibault, S., Namyst, R., Richards, A., Dolinsky, U., Benkner, S., Träff, J. L., & Pllana, S. (2012). Programmability and Performance Portability Aspects of Heterogeneous Multi-/Manycore Systems. In Design, Automation & Test in Europe Conference & Exhibition (DATE 2012) Proceedings (pp. 1403–1408). EDAA. http://hdl.handle.net/20.500.12708/54301 ( reposiTUm)

Edited Proceedings

Hoefler, T., & Träff, J. L. (Eds.). (2019). Proceedings of the 26th European MPI Users’ Group Meeting, EuroMPI 2019. ACM. http://hdl.handle.net/20.500.12708/24628 ( reposiTUm)
Holmes, D., Collis, A., Träff, J. L., & Smith, L. (Eds.). (2016). Proceedings of the 23rd European MPI Users’ Group Meeting, EuroMPI 2016. ACM. http://hdl.handle.net/20.500.12708/24173 ( reposiTUm)
Träff, J. L., Hunold, S., & Versaci, F. (Eds.). (2015). Euro-Par 2015: Parallel Processing. Lecture Notes in Computer Science. Springer-Verlag Berlin Heidelberg. https://doi.org/10.1007/978-3-662-48096-0 ( reposiTUm)
Träff, J. L., Benkner, S., & Dongarra, J. (Eds.). (2012). Recent Advances in the Message Passing Interface Proceedings of the 19th European MPI Users’ Group Meeting, EuroMPI 2012, LNCS 7490. Springer. http://hdl.handle.net/20.500.12708/23532 ( reposiTUm)
Alexander, M., D’Ambra, P., Belloum, A., Bosilca, G., Cannataro, M., Danelutto, M., Di Martino, B., Gerndt, M., Jeannot, E., Namyst, R., Roman, J., Scott, S. L., Träff, J. L., Vallée, G., & Weidendorfer, J. (Eds.). (2012). Euro-Par 2011: Parallel Processing Workshops. Springer. https://doi.org/10.1007/978-3-642-29740-3 ( reposiTUm)

Presentations

Träff, J. L. (2020). Decomposing MPI Collectives for Exploiting Multi-lane Communication. SPCL_Bcast, ETH Zürich, Zürich, Switzerland. http://hdl.handle.net/20.500.12708/87082 ( reposiTUm)
Träff, J. L. (2019). On Optimal Trees for Irregular Gather and Scatter Collectives. Kolloquium Mathematische Informatik, Goethe-Universität Frankfurt am Main, Frankfurt am Main, Germany. http://hdl.handle.net/20.500.12708/86874 ( reposiTUm)
Träff, J. L. (2019). On optimal Trees for irregular gather and scatter collectives? FernUniversität in Hagen, Prof. Dr. Jörg Keller, Hagen, Germany. http://hdl.handle.net/20.500.12708/86906 ( reposiTUm)
Träff, J. L. (2019). Cartesian Collective Communication: “Advice to users”, “Advice to implementers”, and “Advice to Standardizers.” University of Bordeaux, Bordeaux, France. http://hdl.handle.net/20.500.12708/86914 ( reposiTUm)
Träff, J. L. (2019). On optimal Trees for irregular gather and scatter collectives? Humboldt-Universität zu Berlin, Research Group on Modeling and Analysis of Complex Systems, Berlin, Germany. http://hdl.handle.net/20.500.12708/86893 ( reposiTUm)
Träff, J. L. (2018). On Optimal trees for Irregular Gather and Scatter Collectives. Invited Talk at the Uppsala University (Sept-2018), Uppsala, Sweden. http://hdl.handle.net/20.500.12708/86726 ( reposiTUm)
Träff, J. L. (2017). Fast Processing of MPI Derived Datatypes? Mini Workshop Algorithms Engineering, Uni Wien, Vienna, Austria. http://hdl.handle.net/20.500.12708/86518 ( reposiTUm)
Träff, J. L. (2017). High Performance Expectations for MPI. Friedrich-Alexander-Universität Erlangen-Nürnberg, Prof. Dr. Gerhard Wellein, Erlangen, Germany. http://hdl.handle.net/20.500.12708/86505 ( reposiTUm)
Träff, J. L. (2017). The past 25 years of MPI. Panel at ISC High Performance Conference 2017 - The HPC Event, Intel booth, Frankfurt, Germany. http://hdl.handle.net/20.500.12708/86517 ( reposiTUm)
Träff, J. L. (2016). Effective MPI Programming: Concepts, Advanced Features, Do’s and Don’ts. Vienna Scientific Cluster: VSC School Seminar, TU Wien, Vienna, Austria. http://hdl.handle.net/20.500.12708/86253 ( reposiTUm)
Träff, J. L. (2016). On The Power of Structured Data in MPI. Guest Lecture of the course: Parallel and High Performance Computing, LMU Munich, München, Germany. http://hdl.handle.net/20.500.12708/86357 ( reposiTUm)
Träff, J. L. (2016). Polynomial-Time Construction of Optimal MPI Derived Datatype Trees. Leibniz-Rechenzentrum (LRZ), Garching bei München, Germany. http://hdl.handle.net/20.500.12708/86364 ( reposiTUm)
Träff, J. L. (2016). Tutorial: Effective MPI Programming: concepts, advanced features, do’s and don’ts. Tutorial on MPI at the 22nd International European Conference on Parallel and Distributed Computing (Euro-Par 2016), Grenoble, France. http://hdl.handle.net/20.500.12708/86292 ( reposiTUm)
Träff, J. L. (2015). MPI Datatype reconstruction (for vector and index types). Compilers and Languages Group, Institute of Computer Languages, TU Wien, Vienna, Austria. http://hdl.handle.net/20.500.12708/86126 ( reposiTUm)
Träff, J. L. (2015). The Relative Power of Synchronization Primitives. Computational Mathematics in Engineering Group - Prof. Dr. Joachim Schöberl, Institute for Analysis and Scientific Computing, TU Wien, Vienna, Austria. http://hdl.handle.net/20.500.12708/86043 ( reposiTUm)
Träff, J. L. (2015). The Power of Structured Data in MPI. The University of Texas at Austin, Prof. Robert A. van de Geijn, Austin, United States of America. http://hdl.handle.net/20.500.12708/86053 ( reposiTUm)
Träff, J. L. (2014). The Power of Structured Data in MPI. Research Group Theory and Applications of Algorithms and Research Group Scientific Computing, University of Vienna, Vienna, Austria. http://hdl.handle.net/20.500.12708/85825 ( reposiTUm)
Hunold, S., Carpen-Amarie, A., & Träff, J. L. (2014). Reproducible MPI Micro-Benchmarking Isn’t As Easy As You Think. Research Group Theory and Applications of Algorithms, University of Vienna, Vienna, Austria. http://hdl.handle.net/20.500.12708/85872 ( reposiTUm)
Träff, J. L. (2014). The Power of Structured Data in MPI. Compiler Technology and Computer Architecture Group at the University of Hertfordshire, Hertfordshire, United Kingdom. http://hdl.handle.net/20.500.12708/85832 ( reposiTUm)
Träff, J. L. (2014). The Power of Structured Data in MPI. I3MS Seminar Series, Aachen GRS, RWTH Aachen, Aachen, Germany. http://hdl.handle.net/20.500.12708/85805 ( reposiTUm)
Träff, J. L. (2014). Implementing a classic: zero-copy all-to-all communication with MPI datatypes. Department of Computer Science, University of Copenhagen, Copenhagen, Denmark. http://hdl.handle.net/20.500.12708/85783 ( reposiTUm)
Träff, J. L. (2013). Large-scale message passing concepts in EPiGRAM. Workshop on Exascale MPI (ExaMPI 2013) at Supercomputing Conference 2013, Denver, United States of America. http://hdl.handle.net/20.500.12708/85624 ( reposiTUm)
Träff, J. L. (2013). Challenges in Message-Passing Interfaces for Large-Scale Parallel Systems. UPMARC Summer School on Multicore Computing, Uppsala, Sweden. http://hdl.handle.net/20.500.12708/85569 ( reposiTUm)
Wimmer, M., Cederman, D., Träff, J. L., & Tsigas, P. (2013). Work-stealing with Configurable Scheduling Strategies. MADALGO Summer School on DATA STRUCTURES, Aarhus, Denmark. http://hdl.handle.net/20.500.12708/85595 ( reposiTUm)
Träff, J. L. (2013). Unique Features of MPI: Collective Operations on Structured Data. 20th European MPI Users’ Group Meeting, EuroMPI 2013, Madrid, Spain. http://hdl.handle.net/20.500.12708/84678 ( reposiTUm)
Träff, J. L. (2013). History of MPI. UPMARC Summer School on Multicore Computing, Uppsala, Sweden. http://hdl.handle.net/20.500.12708/85614 ( reposiTUm)
Träff, J. L. (2012). Scalability, Expressivity and Performance Portability of Message-Passing Interface(s). VSC Workshop Vienna Scientific Cluster, Neusiedl/See, Austria. http://hdl.handle.net/20.500.12708/85337 ( reposiTUm)
Träff, J. L. (2012). History and development of the MPI standard. AIT Austrian Institute of Technology, Seibersdorf, Austria. http://hdl.handle.net/20.500.12708/85424 ( reposiTUm)
Träff, J. L. (2012). History and Development of MPI: the Message-Passing Interface. CSE Day KTH Stockholm, Stockholm, Sweden. http://hdl.handle.net/20.500.12708/85393 ( reposiTUm)
Gropp, W., Hoefler, T., Thakur, R., & Träff, J. L. (2011). Performance Expectations and Guidelines for MPI Derived Datatypes. EuroMPI 2011, Santorini, Greece, EU. http://hdl.handle.net/20.500.12708/85310 ( reposiTUm)
Bajrovic, E., & Träff, J. L. (2011). Using MPI Derived Datatypes in Numerical Libraries. EuroMPI 2011, Santorini, Greece, EU. http://hdl.handle.net/20.500.12708/85309 ( reposiTUm)

Preprints

Träff, J. L. (2025). Optimal, Non-pipelined Reduce-scatter and Allreduce Algorithms. arXiv. https://doi.org/10.34726/10760 ( reposiTUm)
Träff, J. L. (2025). Communication Round and Computation Efficient Exclusive Prefix-Sums Algorithms (for MPI_Exscan). arXiv. https://doi.org/10.34726/10821 ( reposiTUm)
Träff, J. L. (2024). Lectures on Parallel Computing. arXiv. https://doi.org/10.34726/10819 ( reposiTUm)
Träff, J. L. (2024). Optimal Broadcast Schedules in Logarithmic Time with Applications to Broadcast, All-Broadcast, Reduction and All-Reduction. arXiv. https://doi.org/10.34726/10820 ( reposiTUm)
Träff, J. L. (2023). Round-optimal n-Block Broadcast Schedules in Logarithmic Time. arXiv. https://doi.org/10.34726/7320 ( reposiTUm)
Träff, J. L. (2022). (Poly)Logarithmic Time Construction of Round-optimal n-Block Broadcast Schedules for Broadcast and irregular Allgather in MPI. arXiv. https://doi.org/10.48550/arXiv.2205.10072 ( reposiTUm)
Träff, J. L. (2021). A Doubly-pipelined, Dual-root Reduction-to-all Algorithm and Implementation. arXiv. https://doi.org/10.48550/arXiv.2109.12626 ( reposiTUm)
Träff, J. L., & Pöter, M. (2020). A more Pragmatic Implementation of the Lock-free, Ordered, Linked List. arXiv. https://doi.org/10.48550/arXiv.2010.15755 ( reposiTUm)
Faraj, M. F., van der Grinten, A., Meyerhenke, H., Träff, J. L., & Schulz, C. (2020). High-Quality Hierarchical Process Mapping. arXiv. https://doi.org/10.48550/arXiv.2001.07134 ( reposiTUm)
Hunold, S., von Kirchbach, K., Lehr, M., Schulz, C., & Träff, J. L. (2020). Efficient Process-to-Node Mapping Algorithms for Stencil Computations. arXiv. https://doi.org/10.48550/arXiv.2005.09521 ( reposiTUm)
Träff, J. L. (2020). k-ported vs. k-lane Broadcast, Scatter, and Alltoall Algorithms. arXiv. https://doi.org/10.48550/arXiv.2008.12144 ( reposiTUm)
Kainer, M., & Träff, J. L. (2019). More Parallelism in Dijkstra’s Single-Source Shortest Path Algorithm. arXiv. https://doi.org/10.48550/arXiv.1903.12085 ( reposiTUm)
Träff, J. L. (2019). Decomposing Collectives for Exploiting Multi-lane Communication. arXiv. https://doi.org/10.48550/arXiv.1910.13373 ( reposiTUm)
Pachajoa, C., Levonyak, M., Gansterer, W., & Träff, J. L. (2019). How to Make the Preconditioned Conjugate Gradient Method Resilient Against Multiple Node Failures (1907.13077). arXiv. https://doi.org/10.48550/arXiv.1907.13077 ( reposiTUm)
Pöter, M., & Träff, J. L. (2018). Stamp-it: A more Thread-efficient, Concurrent Memory Reclamation Scheme in the C++ Memory Model. arXiv. https://doi.org/10.48550/arXiv.1805.08639 ( reposiTUm)
Träff, J. L. (2018). Parallel Quicksort without Pairwise Element Exchange. arXiv. https://doi.org/10.48550/arXiv.1804.07494 ( reposiTUm)
Pöter, M., & Träff, J. L. (2018). Memory Models for C/C++ Programmers. arXiv. https://doi.org/10.48550/arXiv.1803.04432 ( reposiTUm)
Schulz, C., & Träff, J. L. (2017). VieM v1.00 - Vienna Mapping and Sparse Quadratic Assignment User Guide. arXiv. https://doi.org/10.48550/arXiv.1703.05509 ( reposiTUm)
Träff, J. L. (2017). Practical, Linear-time, Fully Distributed Algorithms for Irregular Gather and Scatter (1702.05967). arXiv. https://doi.org/10.48550/arXiv.1702.05967 ( reposiTUm)
Träff, J. L. (2017). On Optimal Trees for Irregular Gather and Scatter Collectives. arXiv. https://doi.org/10.48550/arXiv.1711.08731 ( reposiTUm)
Schulz, C., & Träff, J. L. (2017). Better Process Mapping and Sparse Quadratic Assignment. arXiv. https://doi.org/10.48550/arXiv.1702.04164 ( reposiTUm)
Pöter, M., & Träff, J. L. (2017). A new and five older Concurrent Memory Reclamation Schemes in Comparison (Stamp-it). arXiv. https://doi.org/10.48550/arXiv.1712.06134 ( reposiTUm)
Hunold, S., Carpen-Amarie, A., Lübbe, F. D., & Träff, J. L. (2016). PGMPI: Automatically Verifying Self-Consistent MPI Performance Guidelines. arXiv. https://doi.org/10.48550/arXiv.1606.00215 ( reposiTUm)
Carpen-Amarie, A., Hunold, S., & Träff, J. L. (2016). MPI Derived Datatypes: Performance Expectations and Status Quo. arXiv. https://doi.org/10.48550/arXiv.1607.00178 ( reposiTUm)
Gruber, J., Träff, J. L., & Wimmer, M. (2016). Benchmarking Concurrent Priority Queues: Performance of k-LSM and Related Data Structures. arXiv. https://doi.org/10.48550/arXiv.1603.05047 ( reposiTUm)
Träff, J. L., Carpen-Amarie, A., Hunold, S., & Rougier, A. (2016). Message-Combining Algorithms for Isomorphic, Sparse Collective Communication. arXiv. https://doi.org/10.48550/arXiv.1606.07676 ( reposiTUm)
Wimmer, M., Gruber, J., Träff, J. L., & Tsigas, P. (2015). The Lock-free k-LSM Relaxed Priority Queue. arXiv. https://doi.org/10.48550/arXiv.1503.05698 ( reposiTUm)
Ganian, R., Kalany, M., Szeider, S., & Träff, J. L. (2015). Polynomial-time Construction of Optimal Tree-structured Communication Data Layout Descriptions. arXiv. https://doi.org/10.48550/arXiv.1506.09100 ( reposiTUm)
Träff, J. L. (2015). The Shortest Path Problem with Edge Information Reuse is NP-Complete. arXiv. https://doi.org/10.48550/arXiv.1509.05637 ( reposiTUm)
Träff, J. L., & Wimmer, M. (2014). An improved, easily computable combinatorial lower bound for weighted graph bipartitioning. arXiv. https://doi.org/10.48550/arXiv.1410.0462 ( reposiTUm)
Wimmer, M., Cederman, D., Versaci, F., Träff, J. L., & Tsigas, P. (2013). Data Structures for Task-based Priority Scheduling. arXiv. https://doi.org/10.48550/arXiv.1312.2501 ( reposiTUm)
Wimmer, M., Cederman, D., Träff, J. L., & Tsigas, P. (2013). Configurable Strategies for Work-stealing. arXiv. https://doi.org/10.48550/arXiv.1305.6474 ( reposiTUm)
Hunold, S., & Träff, J. L. (2013). On the State and Importance of Reproducible Experimental Research in Parallel Computing. arXiv. https://doi.org/10.48550/arXiv.1308.3648 ( reposiTUm)
Träff, J. L. (2013). A Note on (Parallel) Depth- and Breadth-First Search by Arc Elimination. arXiv. https://doi.org/10.48550/arXiv.1305.1222 ( reposiTUm)
Siebert, C., & Träff, J. L. (2013). Perfectly load-balanced, optimal, stable, parallel merge. arXiv. https://doi.org/10.48550/arXiv.1303.4312 ( reposiTUm)
Träff, J. L. (2012). Simplified, stable parallel merging. arXiv. https://doi.org/10.48550/arXiv.1202.6575 ( reposiTUm)