A survey and classification of software-defined storage systems

R Macedo, J Paulo, J Pereira, A Bessani - ACM Computing Surveys …, 2020 - dl.acm.org
The exponential growth of digital information is imposing increasing scale and efficiency
demands on modern storage infrastructures. As infrastructure complexity increases, so does …

Sizing and partitioning strategies for burst-buffers to reduce io contention

G Aupy, O Beaumont… - 2019 IEEE international …, 2019 - ieeexplore.ieee.org
Burst-Buffers are high-throughput, small-size storage that is being used as an
intermediate storage between the PFS (Parallel File System) and the computational nodes …

Mapping and scheduling HPC applications for optimizing I/O

J Carretero, E Jeannot, G Pallez, DE Singh… - Proceedings of the 34th …, 2020 - dl.acm.org
In HPC platforms, concurrent applications are sharing the same file system. This can lead to
conflicts, especially as applications are more and more data intensive. I/O contention can …

Analysis and correlation of application I/O performance and system-wide I/O activity

S Madireddy, P Balaprakash, P Carns… - … , and Storage (NAS), 2017 - ieeexplore.ieee.org
Storage resources in high-performance computing are shared across all user applications.
Consequently, storage performance can vary markedly, depending not only on an …

The Case for Storage Optimization Decoupling in Deep Learning Frameworks

R Macedo, C Correia, M Dantas, C Brito… - 2021 IEEE …, 2021 - ieeexplore.ieee.org
Deep Learning (DL) training requires efficient access to large collections of data, leading DL
frameworks to implement individual I/O optimizations to take full advantage of storage …

Limitless—light-weight monitoring tool for large scale systems

A Cascajo, DE Singh, J Carretero - Microprocessors and Microsystems, 2022 - Elsevier
This work presents LIMITLESS, an HPC framework that provides new strategies for
monitoring clusters. LIMITLESS is a scalable light-weight monitor that is integrated with other …

Data Flow Lifecycles for Optimizing Workflow Coordination

H Lee, L Guo, M Tang, J Firoz, N Tallent… - Proceedings of the …, 2023 - dl.acm.org
A critical performance challenge in distributed scientific workflows is coordinating tasks and
data flows on distributed resources. To guide these decisions, this paper introduces data …

Scheduling periodic I/O access with bi-colored chains: models and algorithms

E Jeannot, G Pallez, N Vidal - Journal of Scheduling, 2021 - Springer
Observations show that some HPC applications periodically alternate between (i) operations
(computations, local data accesses) executed on the compute nodes, and (ii) I/O transfers of …

NORNS: extending Slurm to support data-driven workflows through asynchronous data staging

A Miranda, A Jackson, T Tocci… - … on Cluster Computing …, 2019 - ieeexplore.ieee.org
As HPC systems move into the Exascale era, parallel file systems are struggling to keep up
with the I/O requirements from data-intensive problems. While the inclusion of burst buffers …

IO-aware Job-Scheduling: Exploiting the Impacts of Workload Characterizations to select the Mapping Strategy

E Jeannot, G Pallez, N Vidal - The International Journal of …, 2023 - journals.sagepub.com
In high performance computing, concurrent applications are sharing the same file system.
However, the bandwidth which provides access to the storage is limited. Therefore, too …