- Academic resource search

MPI performance engineering with the MPI tool interface: the integration of MVAPICH and TAU

S Ramesh, A Mahéo, S Shende, AD Malony… - Proceedings of the 24th …, 2017 - dl.acm.org

MPI implementations are becoming increasingly complex and highly tunable, and thus
scalability limitations can come from numerous sources. The MPI Tools Interface (MPI_T) …

Cited by: 34 · Related articles · All 8 versions

[PDF] nsf.gov

Designing a profiling and visualization tool for scalable and in-depth analysis of high-performance GPU clusters

P Kousha, B Ramesh, KK Suresh… - 2019 IEEE 26th …, 2019 - ieeexplore.ieee.org

The recent advent of advanced fabrics like NVIDIA NVLink is enabling the deployment of
dense Graphics Processing Unit (GPU) systems, e.g., DGX-2 and Summit. The Message …

Cited by: 22 · Related articles · All 3 versions

[PDF] ohio-state.edu

INAM²: InfiniBand Network Analysis and Monitoring with MPI

H Subramoni, AM Augustine, M Arnold… - … Conference on High …, 2016 - Springer

Modern high-end computing is being driven by the tight integration of several hardware and
software components. On the hardware front, there are the multi-/many-core architectures …

Cited by: 27 · Related articles · All 8 versions

[PDF] arxiv.org

A survey of methods for collective communication optimization and tuning

U Wickramasinghe, A Lumsdaine - arXiv preprint arXiv:1611.06334, 2016 - arxiv.org

New developments in HPC technology in terms of increasing computing power on
multi/many core processors, high-bandwidth memory/IO subsystems and communication …

Cited by: 27 · Related articles · All 4 versions

[PDF] sciencedirect.com

MR-Advisor: A comprehensive tuning, profiling, and prediction tool for MapReduce execution frameworks on HPC clusters

M Wasi-ur-Rahman, NS Islam, X Lu, D Shankar… - Journal of Parallel and …, 2018 - Elsevier

MapReduce is the most popular parallel computing framework for big data processing which
allows massive scalability across distributed computing environment. Advanced RDMA …

Cited by: 21 · Related articles · All 2 versions

[PDF] sciencedirect.com

Planning for performance: Enhancing achievable performance for MPI through persistent collective operations

DJ Holmes, B Morgan, A Skjellum, PV Bangalore… - Parallel Computing, 2019 - Elsevier

Advantages of nonblocking collective communication in MPI have been established over the
past quarter century, even predating MPI-1. For regular computations with fixed …

Cited by: 13 · Related articles

[PDF] fz-juelich.de

Enabling callback-driven runtime introspection via MPI_T

MA Hermanns, NT Hjelm, M Knobloch… - Proceedings of the 25th …, 2018 - dl.acm.org

Understanding the behavior of parallel applications that use the Message Passing Interface
(MPI) is critical for optimizing communication performance. Performance tools for MPI …

Cited by: 12 · Related articles · All 3 versions

Planning for performance: persistent collective operations for MPI

B Morgan, DJ Holmes, A Skjellum… - Proceedings of the 24th …, 2017 - dl.acm.org

Advantages of nonblocking collective communication in MPI have been established over the
past quarter century, even predating MPI-1. For regular computations with fixed …

Cited by: 13 · Related articles

[PDF] wiley.com Full View

Communication optimization technology based on network dynamic performance model

X Cui, X Li, B Wang - Mathematical Problems in Engineering, 2020 - Wiley Online Library

This work analyses different communication modes in applications of supercomputing,
proposes a communication dynamic performance model based on topology awareness, and …

Cited by: 7 · Related articles · All 10 versions

Sonar: Automated communication characterization for HPC applications

S Lammel, F Zahn, H Fröning - … , E-MuCoCoS, HPC-IODC, IXPUG, IWOPH …, 2016 - Springer

Future computing systems will need to operate within hard power and energy constraints,
this is particularly true for Exascale-class systems. These constraints are hard for technical …

Cited by: 14 · Related articles