Concealing compression-accelerated I/O for HPC applications through in situ task scheduling

S Jin, S Di, F Vivien, D Wang, Y Robert, D Tao… - Proceedings of the …, 2024 - dl.acm.org
Lossy compression and asynchronous I/O are two of the most effective solutions for reducing
storage overhead and enhancing I/O performance in large-scale high-performance …

Accelerating parallel write via deeply integrating predictive lossy compression with HDF5

S Jin, D Tao, H Tang, S Di, S Byna… - … Conference for High …, 2022 - ieeexplore.ieee.org
Lossy compression is one of the most efficient solutions to reduce storage overhead and
improve I/O performance for HPC applications. However, existing parallel I/O libraries …

Systematically inferring I/O performance variability by examining repetitive job behavior

E Costa, T Patel, B Schwaller, JM Brandt… - Proceedings of the …, 2021 - dl.acm.org
Monitoring and analyzing I/O behaviors is critical to the efficient utilization of parallel storage
systems. Unfortunately, with increasing I/O requirements and resource contention, I/O …

RL-Watchdog: A Fast and Predictable SSD Liveness Watchdog on Storage Systems

JY Ha, S Lee, HY Yeom, Y Son - 2024 USENIX Annual Technical …, 2024 - usenix.org
This paper proposes a reinforcement learning-based watchdog (RLW) that examines solid-
state drive (SSD) liveness or failures by faults (e.g., controller/power faults and high …

Accelerating I/O performance of ZFS-based Lustre file system in HPC environment

J Bang, C Kim, EK Byun, H Sung, J Lee… - The Journal of …, 2023 - Springer
To meet increasing data access performance demands of applications run on high-
performance computing (HPC) systems, an efficient design of HPC storage file system is …

CCFTL: A novel continuity compressed page-level flash address mapping method for SSDs

L Su, M Lin, J Zhang, Y Pan - Journal of Parallel and Distributed Computing, 2024 - Elsevier
Given the distinctive characteristics of flash-based solid-state drives (SSDs), such as out-of-
place update scheme, as compared to traditional block storage devices, a flash translation …

Exploring large all-flash storage system with scientific simulation

J Gu, G Eisenhauer, S Klasky, N Podhorszki… - Proceedings of the 34th …, 2022 - dl.acm.org
Solid state storage systems have been very effectively used in small devices; however, their
effectiveness for large systems such as supercomputers is not yet proven. Recently, for the …

Adaptively periodic I/O scheduling for concurrent HPC applications

B Zha, H Shen - Electronics, 2022 - mdpi.com
With the convergence of big data and HPC (high-performance computing), various machine
learning applications and traditional large-scale simulations with a stochastically iterative I/O …

Design and Implementation of Burst Buffer Over-Subscription Scheme for HPC Storage Systems

J Bang, A Sim, GK Lockwood, H Eom, H Sung - IEEE Access, 2023 - ieeexplore.ieee.org
Burst Buffer is widely used in supercomputer centers to bridge the performance gap
between computational power and the high-performance I/O systems. The primary role of …

Towards zero-waste recovery and zero-overhead checkpointing in ensemble data assimilation

K Keller, AC Kestelman… - 2021 IEEE 28th …, 2021 - ieeexplore.ieee.org
Ensemble data assimilation is a powerful tool for increasing the accuracy of climatological
states. It is based on combining observations with the results from numerical model …