Implementing OpenSHMEM using MPI-3 one-sided communication
This paper reports the design and implementation of OpenSHMEM over MPI using new one-
sided communication features in MPI-3, which include not only new functions (e.g., remote …
A comprehensive performance evaluation of OpenSHMEM libraries on InfiniBand clusters
OpenSHMEM is an open standard that brings together several long-standing, vendor-
specific SHMEM implementations that allows applications to use SHMEM in a platform …
Proxy-equation paradigm: A strategy for massively parallel asynchronous computations
A Mittal, S Girimaji - Physical Review E, 2017 - APS
Massively parallel simulations of transport equation systems call for a paradigm change in
algorithm development to achieve efficient scalability. Traditional approaches require time …
Maximizing application performance in a multi-core, NUMA-aware compute cluster by multi-level tuning
G Shainer, P Lui, M Hilgeman, J Layton… - … Conference, ISC 2013 …, 2013 - Springer
Achieving good application performance on a modern compute cluster of multi-core, multi-
socket, NUMA-aware systems can be challenging. In this paper, we use VASP, a popular ab …
Early evaluation of scalable fabric interface for PGAS programming models
M Luo, K Seager, KS Murthy, CJ Archer, S Sur… - Proceedings of the 8th …, 2014 - dl.acm.org
Inter-processor communication is a critical factor for performance at scale. In order to
achieve good performance, communication overheads should be minimized. The fabric …
Optimal partitioning for parallel matrix computation on a small number of abstract heterogeneous processors
A DeFlumere - 2014 - 137.43.92.117
Abstract High Performance Computing (HPC) has grown to encompass many new
architectures and algorithms. The Top500 list, which ranks the world's fastest …
Effects of Processor-Native Memory Transactions in Optimizing RDMA Transfers in Distributed Shared Memory Systems
K Paraskevas - 2021 - search.proquest.com
Reducing latency and increasing the throughput of issued data transfers is a core
requirement if we are to meet the needs of future systems at scale, and therefore, fast …
Analysing the influence of InfiniBand choice on OpenMPI memory consumption
O Perks, DA Beckingsale, AS Dawes… - … Conference on High …, 2013 - ieeexplore.ieee.org
The ever increasing scale of modern high performance computing platforms poses
challenges for system architects and code developers alike. The increase in core count …
'Proxy-equation' paradigm - A novel strategy for massively-parallel asynchronous computations
A Mittal, S Girimaji - arXiv preprint arXiv:1611.04985, 2016 - arxiv.org
Massively parallel simulations of transport equation systems call for a paradigm change in
algorithm development to achieve efficient scalability. Traditional approaches require time …
Temporal Reasoning in Medicine for Type 2 Diabetes Mellitus Patient Outcomes and Treatments Using Dynamic Bayesian Networks
RL Angell - 2018 - search.proquest.com
Medicine is the art and science of diagnosis and treatment of disease-maintenance of one's
health. Temporal reasoning in medicine is the art and practice of modeling one's …