Revisiting I/O behavior in large-scale storage systems: The expected and the unexpected

T Patel, S Byna, GK Lockwood, D Tiwari - Proceedings of the …, 2019 - dl.acm.org
Large-scale applications typically spend a large fraction of their execution time performing
I/O to a parallel storage system. However, with rapid progress in compute and storage …

ElastiSim: a batch-system simulator for malleable workloads

T Özden, T Beringer, A Mazaheri, HM Fard… - Proceedings of the 51st …, 2022 - dl.acm.org
As high-performance computing infrastructures move towards exascale, the role of resource
and job management systems is more critical now than ever. Simulating batch systems to …

PRIONN: Predicting runtime and IO using neural networks

MR Wyatt, S Herbein, T Gamblin, A Moody… - Proceedings of the 47th …, 2018 - dl.acm.org
For job allocation decision, current batch schedulers have access to and use only
information on the number of nodes and runtime because it is readily available at …

Zero-cycle loads: Microarchitecture support for reducing load latency

TM Austin, GS Sohi - Proceedings of the 28th annual …, 1995 - ieeexplore.ieee.org
Untolerated load instruction latencies often have a significant impact on overall program
performance. As one means of mitigating this effect we present an aggressive hardware …

Influence of noisy environments on behavior of HPC applications

DA Nikitenko, F Wolf, B Mohr, T Hoefler… - Lobachevskii Journal of …, 2021 - Springer
Many contemporary HPC systems expose their jobs to substantial amounts of interference,
leading to significant run-to-run variation. For example, application runtimes on Theta, a …

Quantifying I/O and communication traffic interference on dragonfly networks equipped with burst buffers

M Mubarak, P Carns, J Jenkins, JK Li… - … on cluster computing …, 2017 - ieeexplore.ieee.org
HPC systems have shifted to burst buffer storage and high radix interconnect topologies in
order to meet the challenges of large-scale, data-intensive scientific computing. Both of …

Evaluation of an interference-free node allocation policy on fat-tree clusters

SD Pollard, N Jain, S Herbein… - … Conference for High …, 2018 - ieeexplore.ieee.org
Interference between jobs competing for network bandwidth on a fat-tree cluster can cause
significant variability and degradation in performance. These performance issues can be …

Performance characterization of scientific workflows for the optimal use of burst buffers

CS Daley, D Ghoshal, GK Lockwood, S Dosanjh… - Future Generation …, 2020 - Elsevier
Scientific discoveries are increasingly dependent upon the analysis of large volumes of data
from observations and simulations of complex phenomena. Scientists compose the complex …

GIFT: A coupon based Throttle-and-Reward mechanism for fair and efficient I/O bandwidth management on parallel storage systems

T Patel, R Garg, D Tiwari - 18th USENIX Conference on File and Storage …, 2020 - usenix.org
Large-scale parallel applications are highly data-intensive and perform terabytes of I/O
routinely. Unfortunately, on a large-scale system where multiple applications run …

Evaluating burst buffer placement in HPC systems

H Khetawat, C Zimmer, F Mueller… - 2019 IEEE …, 2019 - ieeexplore.ieee.org
Burst buffers (BBs) are increasingly exploited in contemporary supercomputers to bridge the
performance gap between compute and storage systems. The design of BBs, particularly the …