A survey on malleability solutions for high-performance distributed computing
Maintaining a high rate of productivity, in terms of completed jobs per unit of time, in High-
Performance Computing (HPC) facilities is a cornerstone in the next generation of exascale …
Dynamic spawning of MPI processes applied to malleability
Malleability allows computing facilities to adapt their workloads through resource
management systems to maximize the throughput of the facility and the efficiency of the …
Transparent resource elasticity for task-based cluster environments with work stealing
J Posner, C Fohry - 50th International Conference on Parallel …, 2021 - dl.acm.org
Resource elasticity allows to dynamically change the resources of running jobs, which may
significantly improve the throughput on supercomputers. Elasticity requires support from …
Malleable APGAS programs and their support in batch job schedulers
P Finnerty, L Takaoka, T Kanzaki, J Posner - European Conference on …, 2023 - Springer
Malleability—the ability for applications to dynamically adjust their resource allocations at
runtime—presents great potential to enhance the efficiency and resource utilization of …
Enhancing supercomputer performance with malleable job scheduling strategies
J Posner, F Hupfeld, P Finnerty - European Conference on Parallel …, 2023 - Springer
In recent years, supercomputers have experienced significant advancements in
performance and have grown in size, now comprising several thousand nodes. To unlock …
Proteo: a framework for the generation and evaluation of malleable MPI applications
Applying malleability to HPC systems can increase their productivity without degrading or
even improving the performance of running applications. This paper presents Proteo, a …
Scheduling of elastic message passing applications on HPC systems
DH Lina, S Ghafoor, T Hines - Workshop on Job Scheduling Strategies for …, 2022 - Springer
Elastic parallel applications that can change the number of processors while being executed
promise improved application and system performance, allow new classes of data and event …
Evaluating Data Redistribution in PaRSEC
Data redistribution aims to reshuffle data to optimize some objective for an algorithm. The
objective can be multi-dimensional, such as improving computational load balance or …
An emulation layer for dynamic resources with MPI sessions
J Fecht, M Schreiber, M Schulz, H Pritchard… - … Conference on High …, 2022 - Springer
The current static job scheduling on supercomputers for MPI-based applications is well
known to be a limiting factor for the exploitation of a system's top performance in terms of …
Malleability in Modern HPC Systems: Current Experiences, Challenges, and Future Opportunities
With the increase of complex scientific simulations driven by workflows and heterogeneous
workload profiles, managing system resources effectively is essential for improving …