GRAP: group-level resource allocation policy for reconfigurable Dragonfly network in HPC

G Feng, D Dong, S Zhao, Y Lu - … of the 37th International conference on …, 2023 - dl.acm.org
Dragonfly is a highly scalable, low-diameter, and cost-efficient network topology, which has
been adopted in new exascale High Performance Computing (HPC) systems. However …

An analysis of long-tailed network latency distribution and background traffic on Dragonfly+

M Salimi Beni, B Cosenza - International Symposium on Benchmarking …, 2022 - Springer
Modern computing systems are highly affected by large performance variability, resulting in
a long tail in the distribution of the network latency. For communication-intensive …

RaDD runtimes: Radical and different distributed runtimes with SmartNICs

RE Grant, W Schonbein, S Levy - 2020 IEEE/ACM Fourth …, 2020 - ieeexplore.ieee.org
As network speeds increase, the overhead of processing incoming messages is becoming
onerous enough that many manufacturers now provide network interface cards (NICs) with …

Analysis and prediction of performance variability in large-scale computing systems

M Salimi Beni, S Hunold, B Cosenza - The Journal of Supercomputing, 2024 - Springer
The development of new exascale supercomputers has dramatically increased the need for
fast, high-performance networking technology. Efficient network topologies, such as …

Faster and Scalable MPI Applications Launching

Y Dong, Y Dai, M Xie, K Lu, R Wang… - … on Parallel and …, 2022 - ieeexplore.ieee.org
Distributed parallel MPI applications are the dominant workload in many high-performance
computing systems. While optimizing MPI application execution is a well-studied field, little …

Energy-Efficient Interconnection Networks for High-Performance Computing

F Zahn - 2020 - archiv.ub.uni-heidelberg.de
In recent years, energy has become one of the most important factors for designing and
operating large scale computing systems. This is particularly true in high-performance …