Parallel programming with migratable objects: Charm++ in practice

B Acun, A Gupta, N Jain, A Langer… - SC'14: Proceedings …, 2014 - ieeexplore.ieee.org
The advent of petascale computing has introduced new challenges (e.g. Heterogeneity,
system failure) for programming scalable parallel applications. Increased complexity and …

Maximizing throughput of overprovisioned HPC data centers under a strict power budget

O Sarood, A Langer, A Gupta… - SC'14: Proceedings of the …, 2014 - ieeexplore.ieee.org
Building future generation supercomputers while constraining their power consumption is
one of the biggest challenges faced by the HPC community. For example, US Department of …

Open problems in queueing theory inspired by datacenter computing

M Harchol-Balter - Queueing Systems, 2021 - Springer
Datacenter operations today provide a plethora of new queueing and scheduling problems.
The notion of a “job” has become more general and multi-dimensional. The ways in which …

A batch system with efficient adaptive scheduling for malleable and evolving applications

S Prabhakaran, M Neumann, S Rinke… - 2015 IEEE …, 2015 - ieeexplore.ieee.org
The throughput of supercomputers depends not only on efficient job scheduling but also on
the type of jobs that form the workload. Malleable jobs are most favourable for a cluster as …

Enhancing the performance of malleable MPI applications by using performance-aware dynamic reconfiguration

G Martín, DE Singh, MC Marinescu, J Carretero - Parallel Computing, 2015 - Elsevier
The work in this paper focuses on providing malleability to MPI applications by using a novel
performance-aware dynamic reconfiguration technique. This paper describes the design …

Towards realizing the potential of malleable jobs

A Gupta, B Acun, O Sarood… - 2014 21st International …, 2014 - ieeexplore.ieee.org
Malleable jobs are those which can dynamically shrink or expand the number of processors
on which they are executing at runtime in response to an external command. Malleable jobs …

Dynamic resource allocation for efficient parallel CFD simulations

G Houzeaux, RM Badia, R Borrell, D Dosimont… - Computers & …, 2022 - Elsevier
CFD users of supercomputers usually resort to rule-of-thumb methods to select the number
of subdomains (partitions) when relying on MPI-based parallelization. One common …

DROM: Enabling efficient and effortless malleability for resource managers

M D'Amico, M Garcia-Gasulla, V López… - … Proceedings of the …, 2018 - dl.acm.org
In the design of future HPC systems, research in resource management is showing an
increasing interest in a more dynamic control of the available resources. It has been proven …

A batch system with fair scheduling for evolving applications

S Prabhakaran, M Iqbal, S Rinke… - 2014 43rd …, 2014 - ieeexplore.ieee.org
Cluster batch systems usually support only static allocation of resources to applications
before job start. After job start, applications cannot increase or decrease their resource set …

Holistic slowdown driven scheduling and resource management for malleable jobs

M D'Amico, A Jokanovic, J Corbalan - Proceedings of the 48th …, 2019 - dl.acm.org
In job scheduling, the concept of malleability has been explored since many years ago.
Research shows that malleability improves system performance, but its utilization in HPC …