Optimization of collective communication operations in MPICH

R Thakur, R Rabenseifner… - The International Journal …, 2005 - journals.sagepub.com
We describe our work on improving the performance of collective communication operations
in MPICH for clusters connected by switched networks. For each collective operation, we …

Blink: Fast and generic collectives for distributed ml

G Wang, S Venkataraman… - Proceedings of …, 2020 - proceedings.mlsys.org
Model parameter synchronization across GPUs introduces high overheads for data-parallel
training at scale. Existing parameter synchronization protocols cannot effectively …

The Globus striped GridFTP framework and server

W Allcock, J Bresnahan, R Kettimuthu… - SC'05: Proceedings of …, 2005 - ieeexplore.ieee.org
The GridFTP extensions to the File Transfer Protocol define a general-purpose mechanism
for secure, reliable, high-performance data movement. We report here on the Globus striped …

Collective communication: theory, practice, and experience

E Chan, M Heimlich, A Purkayastha… - Concurrency and …, 2007 - Wiley Online Library
We discuss the design and high‐performance implementation of collective communications
operations on distributed‐memory computer architectures. Using a combination of known …

MPICH-G2: A grid-enabled implementation of the message passing interface

NT Karonis, B Toonen, I Foster - Journal of Parallel and Distributed …, 2003 - Elsevier
Application development for distributed-computing “Grids” can benefit from tools that
variously hide or enable application-level management of critical aspects of the …

Improving the performance of collective operations in MPICH

R Thakur, WD Gropp - … Parallel Virtual Machine/Message Passing Interface …, 2003 - Springer
We report on our work on improving the performance of collective operations in MPICH on
clusters connected by switched networks. For each collective operation, we use multiple …

Optimization of collective reduction operations

R Rabenseifner - Computational Science-ICCS 2004: 4th International …, 2004 - Springer
A 5-year-profiling in production mode at the University of Stuttgart has shown that more than
40% of the execution time of Message Passing Interface (MPI) routines is spent in the …

[BOOK][B] Securing the internet of things

S Misra, M Maheswaran, S Hashmi, S Misra… - 2017 - Springer
Security and privacy are the prime constraints to the popularity and acceptance of the IoT.
Figure 4.1 from [179] indicates the opinions of security personnel active in the IT space on …

Large-scale electronic structure calculations of high-Z metals on the BlueGene/L platform

F Gygi, EW Draeger, M Schulz… - Proceedings of the …, 2006 - dl.acm.org
First-principles simulations of high-Z metallic systems using the Qbox code on the
BlueGene/L supercomputer demonstrate unprecedented performance and scaling for a …

Network performance aware MPI collective communication operations in the cloud

Y Gong, B He, J Zhong - IEEE Transactions on Parallel and …, 2013 - ieeexplore.ieee.org
This paper examines the performance of collective communication operations in message
passing interfaces (MPI) in the cloud computing environment. The awareness of network …