Advanced thread synchronization for multithreaded MPI implementations

HV Dang, S Seo, A Amer… - 2017 17th IEEE/ACM …, 2017 - ieeexplore.ieee.org
Concurrent multithreaded access to the Message Passing Interface (MPI) is gaining
importance to support emerging hybrid MPI applications. The interoperability between …

Systemwide power management with Argo

D Ellsworth, T Patki, S Perarnau, S Seo… - 2016 IEEE …, 2016 - ieeexplore.ieee.org
The Argo project is a DOE initiative for designing a modular operating system/runtime for the
next generation of supercomputers. A key focus area in this project is power management …

An efficient abortable-locking protocol for multi-level NUMA systems

M Chabbi, A Amer, S Wen, X Liu - … on Principles and Practice of Parallel …, 2017 - dl.acm.org
The popularity of Non-Uniform Memory Access (NUMA) architectures has led to numerous
locality-preserving hierarchical lock designs, such as HCLH, HMCS, and cohort locks …

Software combining to mitigate multithreaded MPI contention

A Amer, C Archer, M Blocksome, C Cao… - Proceedings of the …, 2019 - dl.acm.org
Efforts to mitigate lock contention from concurrent threaded accesses to MPI have reduced
contention through fine-grained locking, avoided locking altogether by offloading …

Lock contention management in multithreaded MPI

A Amer, H Lu, P Balaji, M Chabbi, Y Wei… - ACM Transactions on …, 2019 - dl.acm.org
In this article, we investigate contention management in lock-based thread-safe MPI
libraries. Specifically, we make two assumptions: (1) locks are the only form of …

A Distributed Version of Syrup

G Audemard, JM Lagniez, N Szczepanski… - … Conference on Theory …, 2017 - Springer
A portfolio SAT solver has to share clauses in order to be efficient. In a distributed
environment, such sharing implies additional problems: more information has to be …

Level-synchronous BFS algorithm implemented in Java using PCJ library

M Ryczkowska, M Nowicki… - … on Computational Science …, 2016 - ieeexplore.ieee.org
Graph processing is used in many fields of science such as sociology, risk prediction or
biology. Although analysis of graphs is important, it also poses numerous challenges …

Towards data-flow parallelization for adaptive mesh refinement applications

K Sala, A Rico, V Beltran - 2020 IEEE International Conference …, 2020 - ieeexplore.ieee.org
Adaptive Mesh Refinement (AMR) is a prevalent method used by distributed-memory
simulation applications to adapt the accuracy of their solutions depending on the turbulent …

Analyzing the performance trade-off in implementing user-level threads

S Iwasaki, A Amer, K Taura… - IEEE Transactions on …, 2020 - ieeexplore.ieee.org
User-level threads have been widely adopted as a means of achieving lightweight
concurrent execution without the costs of OS-level threads. Nevertheless, the costs of …

Lessons learned from analyzing dynamic promotion for user-level threading

S Iwasaki, A Amer, K Taura… - … Conference for High …, 2018 - ieeexplore.ieee.org
A performance vs. practicality trade-off exists between user-level threading techniques. The
community has settled mostly on a black-and-white perspective; fully fledged threads …