An autonomic performance environment for exascale
Exascale systems will require new approaches to performance observation, analysis, and
runtime decision-making to optimize for performance and efficiency. The standard "first …
Predicting MPI collective communication performance using machine learning
The Message Passing Interface (MPI) defines the semantics of data communication
operations, while the implementing libraries provide several parameterized algorithms for …
MPI performance engineering with the MPI tool interface: the integration of MVAPICH and TAU
MPI implementations are becoming increasingly complex and highly tunable, and thus
scalability limitations can come from numerous sources. The MPI Tools Interface (MPI_T) …
Autotuning MPI collectives using performance guidelines
S Hunold, A Carpen-Amarie - … of the International Conference on High …, 2018 - dl.acm.org
MPI collective operations provide a standardized interface for performing data movements
within a group of processes. The efficiency of collective communication operations depends …
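The guideline-based autotuning idea above can be illustrated with a minimal sketch: a self-consistent performance guideline states, for example, that a monolithic collective such as MPI_Allreduce should not be slower than an equivalent composition (MPI_Reduce followed by MPI_Bcast); measured violations flag algorithm choices worth retuning. All timings and names below are mock, illustrative data, not measurements from the cited paper.

```python
# Hypothetical sketch of checking an MPI performance guideline:
# a collective (e.g. MPI_Allreduce) should not be slower than an
# equivalent composed emulation (MPI_Reduce + MPI_Bcast).
# Timings are mock data; a real autotuner would benchmark them with MPI.

def violates_guideline(t_collective, t_composed, tolerance=0.05):
    """Return True if the monolithic collective is slower than its
    composed emulation by more than the given relative tolerance."""
    return t_collective > t_composed * (1.0 + tolerance)

# Mock measured latencies in microseconds per message size (illustrative only).
timings = {
    1024:    {"allreduce": 12.0,  "reduce_then_bcast": 15.0},
    65536:   {"allreduce": 48.0,  "reduce_then_bcast": 41.0},  # violation
    1048576: {"allreduce": 310.0, "reduce_then_bcast": 305.0},
}

violations = [size for size, t in timings.items()
              if violates_guideline(t["allreduce"], t["reduce_then_bcast"])]
print(violations)  # message sizes where switching algorithms could help
```

A tuner would then re-benchmark only the flagged message sizes against the implementation's alternative algorithms, shrinking the search space.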
A survey of methods for collective communication optimization and tuning
U Wickramasinghe, A Lumsdaine - arXiv preprint arXiv:1611.06334, 2016 - arxiv.org
New developments in HPC technology in terms of increasing computing power on
multi/many core processors, high-bandwidth memory/IO subsystems and communication …
ACCLAiM: Advancing the practicality of MPI collective communication autotuning using machine learning
MPI collective communication is an omnipresent communication model for high-
performance computing (HPC) systems. The performance of a collective operation depends …
Autotuning of MPI applications using PTF
The main problem when trying to optimize the parameters of libraries, such as MPI, is that
there are many parameters that users can configure. Moreover, predicting the behavior of …
Locality and topology aware intra-node communication among multicore CPUs
A major trend in HPC is the escalation toward manycore, where systems are composed of
shared memory nodes featuring numerous processing units. Unfortunately, with scale …
Optimizing MPI runtime parameter settings by using machine learning
S Pellegrini, J Wang, T Fahringer… - Recent Advances in …, 2009 - Springer
Manually tuning MPI runtime parameters is a practice commonly employed to optimise MPI
application performance on a specific architecture. However, the best setting for these …
A FACT-based approach: Making machine learning collective autotuning feasible on exascale systems
According to recent performance analyses, MPI collective operations make up a quarter of
the execution time on production systems. Machine learning (ML) autotuners use supervised …
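The supervised-learning approach behind ML collective autotuners can be sketched very compactly: learn a mapping from a job's features (message size, process count) to the fastest algorithm observed in training benchmarks. The following is a minimal illustration of that idea, not the actual method of ACCLAiM or the FACT approach; the training labels, algorithm names, and the 1-nearest-neighbour model are all assumptions made for the sketch.

```python
# Minimal sketch of supervised collective autotuning: a 1-nearest-neighbour
# classifier over log-scaled features predicts the fastest broadcast
# algorithm. Training data and algorithm names below are mock labels.

import math

# (message_size_bytes, num_procs) -> fastest broadcast algorithm (mock data)
training = [
    ((256,      16),  "binomial_tree"),
    ((512,      64),  "binomial_tree"),
    ((65536,    64),  "scatter_allgather"),
    ((1048576, 128),  "scatter_allgather"),
    ((4194304, 256),  "pipelined_ring"),
]

def features(size, procs):
    # Log-scale both features so distances are meaningful across magnitudes.
    return (math.log2(size), math.log2(procs))

def predict(size, procs):
    """Predict the best algorithm as that of the nearest training sample."""
    fx = features(size, procs)
    def dist(sample):
        (s, p), _ = sample
        fy = features(s, p)
        return (fx[0] - fy[0]) ** 2 + (fx[1] - fy[1]) ** 2
    return min(training, key=dist)[1]

print(predict(2097152, 256))  # falls nearest the pipelined_ring sample
```

A production autotuner replaces the toy model with one trained on thousands of benchmarked configurations and richer features (network topology, node count, datatype), but the train-then-predict structure is the same.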