GRAP: group-level resource allocation policy for reconfigurable Dragonfly network in HPC
Dragonfly is a highly scalable, low-diameter, and cost-efficient network topology, which has
been adopted in new exascale High Performance Computing (HPC) systems. However …
An analysis of long-tailed network latency distribution and background traffic on dragonfly+
M Salimi Beni, B Cosenza - International Symposium on Benchmarking …, 2022 - Springer
Modern computing systems are highly affected by large performance variability, resulting in
a long tail in the distribution of the network latency. For communication-intensive …
RaDD runtimes: Radical and different distributed runtimes with smartnics
As network speeds increase, the overhead of processing incoming messages is becoming
onerous enough that many manufacturers now provide network interface cards (NICs) with …
Analysis and prediction of performance variability in large-scale computing systems
The development of new exascale supercomputers has dramatically increased the need for
fast, high-performance networking technology. Efficient network topologies, such as …
Faster and Scalable MPI Applications Launching
Y Dong, Y Dai, M Xie, K Lu, R Wang… - … on Parallel and …, 2022 - ieeexplore.ieee.org
Distributed parallel MPI applications are the dominant workload in many high-performance
computing systems. While optimizing MPI application execution is a well-studied field, little …
Energy-Efficient Interconnection Networks for High-Performance Computing
F Zahn - 2020 - archiv.ub.uni-heidelberg.de
In recent years, energy has become one of the most important factors for designing and
operating large scale computing systems. This is particularly true in high-performance …