GPU-accelerated molecular dynamics: State-of-art software performance and porting from Nvidia CUDA to AMD HIP
N Kondratyuk, V Nikolskiy, D Pavlov… - … Journal of High …, 2021 - journals.sagepub.com
Classical molecular dynamics (MD) calculations represent a significant part of the utilization
time of high-performance computing systems. As usual, the efficiency of such calculations is …
GPU algorithms for efficient exascale discretizations
In this paper we describe the research and development activities in the Center for Efficient
Exascale Discretization within the US Exascale Computing Project, targeting state-of-the-art …
HMG: Extending cache coherence protocols across modern hierarchical multi-GPU systems
Prior work on GPU cache coherence has shown that simple hardware- or software-based
protocols can be more than sufficient. However, in recent years, features such as multi-chip …
HipBone: A performance-portable graphics processing unit-accelerated C++ version of the NekBone benchmark
We present hipBone, an open-source performance-portable proxy application for the
Nek5000 (and NekRS) computational fluid dynamics applications. HipBone is a fully GPU …
Strong scaling of OpenACC enabled Nek5000 on several GPU based HPC systems
J Vincent, J Gong, M Karp, A Peplinski… - … Conference on High …, 2022 - dl.acm.org
We present new results on the strong parallel scaling for the OpenACC-accelerated
implementation of the high-order spectral element fluid dynamics solver Nek5000. The test …
Tools for top-down performance analysis of GPU-accelerated applications
K Zhou, MW Krentel, J Mellor-Crummey - Proceedings of the 34th ACM …, 2020 - dl.acm.org
This paper describes extensions to Rice University's HPCToolkit performance tools to
support measurement and analysis of GPU-accelerated applications. To help developers …
On the performance portability of OpenACC, OpenMP, Kokkos and RAJA
A Marowka - … Conference on High Performance Computing in Asia …, 2022 - dl.acm.org
Performance Portability frameworks are becoming more central and essential in
heterogeneous computing systems. However, the developer toolbox lacks the tools to …
Exploring the acceleration of Nekbone on reconfigurable architectures
N Brown - 2020 IEEE/ACM International Workshop on …, 2020 - ieeexplore.ieee.org
Hardware technological advances are struggling to match scientific ambition, and a key
question is how we can use the transistors that we already have more effectively. This is …
High-performance spectral element methods on field-programmable gate arrays: implementation, evaluation, and future projection
Improvements in computer systems have historically relied on two well-known observations:
Moore's law and Dennard's scaling. Today, both these observations are ending, forcing …
Accelerated CPU–GPUs implementations for quaternion polar harmonic transform of color images
Image moments are used to capture image features. Moments are successfully used in
object descriptions, recognition, and other applications. However, image representation and …