HPX--an open source C++ standard library for parallelism and concurrency
To achieve scalability with today's heterogeneous HPC resources, we need a dramatic shift
in our thinking; MPI+X is not enough. Asynchronous Many Task (AMT) runtime systems …
Enabling communication concurrency through flexible MPI endpoints
MPI defines a one-to-one relationship between MPI processes and ranks. This model
captures many use cases effectively; however, it also limits communication concurrency and …
MT-MPI: Multithreaded MPI for many-core environments
Many-core architectures, such as the Intel Xeon Phi, provide dozens of cores and hundreds
of hardware threads. To utilize such architectures, application programmers are increasingly …
Enabling efficient multithreaded MPI communication through a library-based implementation of MPI endpoints
S Sridharan, J Dinan… - SC'14: Proceedings of the …, 2014 - ieeexplore.ieee.org
Modern high-speed interconnection networks are designed with capabilities to support
communication from multiple processor cores. The MPI endpoints extension has been …
Why is MPI so slow? Analyzing the fundamental limits in implementing MPI-3.1
This paper provides an in-depth analysis of the software overheads in the MPI performance-
critical path and exposes mandatory performance overheads that are unavoidable based on …
How I learned to stop worrying about user-visible endpoints and love MPI
R Zambre, A Chandramowlishwaran… - Proceedings of the 34th …, 2020 - dl.acm.org
MPI+threads is gaining prominence as an alternative to the traditional "MPI everywhere"
model in order to better handle the disproportionate increase in the number of cores …
Lessons learned on MPI+threads communication
R Zambre… - … Conference for High …, 2022 - ieeexplore.ieee.org
Hybrid MPI+threads programming is gaining prominence, but, in practice, applications
perform slower with it compared to the MPI everywhere model. The most critical challenge to …
Callback-based completion notification using MPI Continuations
J Schuchart, P Samfass, C Niethammer, J Gracia… - Parallel Computing, 2021 - Elsevier
Asynchronous programming models (APM) are gaining more and more traction, allowing
applications to expose the available concurrency to a runtime system tasked with …
Process-in-process: techniques for practical address-space sharing
The two most common parallel execution models for many-core CPUs today are
multiprocess (e.g., MPI) and multithread (e.g., OpenMP). The multiprocess model allows each …
A heterogeneous MPI+PPL task scheduling approach for asynchronous many-task runtime systems
Asynchronous many-task runtime systems and MPI+X hybrid parallelism approaches have
shown promise for helping manage the increasing complexity of nodes in current and …