Fine-grained energy efficiency using per-core DVFS with an adaptive runtime system
B Acun, K Chandrasekar… - 2019 Tenth International …, 2019 - ieeexplore.ieee.org
Dynamic voltage and frequency scaling (DVFS) is a well-known technique to reduce the
power and/or energy consumption of various applications. While most processors provide …
Give MPI threading a fair chance: A study of multithreaded MPI designs
T Patinyasakdikul, D Eberius… - … Conference on Cluster …, 2019 - ieeexplore.ieee.org
The Message Passing Interface (MPI) has been one of the most prominent programming
paradigms in high-performance computing (HPC) for the past decade. Lately, with changes …
Improving MPI multi-threaded RMA communication performance
One-sided communication is crucial to enabling communication concurrency. As core counts
have increased, particularly with many-core architectures, one-sided (RMA) communication …
How I learned to stop worrying about user-visible endpoints and love MPI
R Zambre, A Chandramowliswharan… - Proceedings of the 34th …, 2020 - dl.acm.org
MPI+threads is gaining prominence as an alternative to the traditional "MPI everywhere"
model in order to better handle the disproportionate increase in the number of cores …
Lessons learned on MPI+threads communication
R Zambre… - … Conference for High …, 2022 - ieeexplore.ieee.org
Hybrid MPI+threads programming is gaining prominence, but, in practice, applications
perform slower with it compared to the MPI everywhere model. The most critical challenge to …
MPI Sessions: Leveraging runtime infrastructure to increase scalability of applications at exascale
MPI includes all processes in MPI_COMM_WORLD; this is untenable for reasons of scale,
resiliency, and overhead. This paper offers a new approach, extending MPI with a new …
ExaMPI: A modern design and implementation to accelerate Message Passing Interface innovation
The difficulty of deep experimentation with Message Passing Interface (MPI)
implementations—which are quite large and complex—substantially raises the cost and …
RMA-MT: a benchmark suite for assessing MPI multi-threaded RMA performance
MGF Dosanjh, T Groves, RE Grant… - 2016 16th IEEE/ACM …, 2016 - ieeexplore.ieee.org
Reaching Exascale will require leveraging massive parallelism while potentially leveraging
asynchronous communication to help achieve scalability at such large levels of concurrency …
Process-in-process: techniques for practical address-space sharing
The two most common parallel execution models for many-core CPUs today are
multiprocess (e.g., MPI) and multithread (e.g., OpenMP). The multiprocess model allows each …
Fuzzy matching: Hardware accelerated MPI communication middleware
Contemporary parallel scientific codes often rely on message passing for inter-process
communication. However, inefficient coding practices or multithreading (e.g., via …