Scalable Tuning of (OpenMP) GPU Applications via Kernel Record and Replay
HPC is a heterogeneous world in which host and device code are interleaved throughout
the application. Given the significant performance advantage of accelerators, device code …
Thermo4PFM: Facilitating phase-field simulations of alloys with thermodynamic driving forces
Phase-field modeling is a popular front-tracking approach used to model solidification. Its
time-evolution equations are often coupled to alloy composition and/or thermal diffusion in …
Direct GPU compilation and execution for host applications with OpenMP Parallelism
Currently, offloading to accelerators requires users to identify which regions are to be
executed on the device, what memory needs to be transferred, and how synchronization is …
Experiences building an MLIR-based SYCL compiler
Similar to other programming models, compilers for SYCL, the open programming model for
heterogeneous computing based on C++, would benefit from access to higher-level …
GPU First--Execution of Legacy CPU Codes on GPUs
Utilizing GPUs is critical for high performance on heterogeneous systems. However,
leveraging the full potential of GPUs for accelerating legacy CPU applications can be a …
Advancing the state of the art of directive-based programming for GPUs: runtime and compilation
K Matsumura - 2024 - upcommons.upc.edu
(English) The rapid development in computing technology has paved the way for directive-
based programming models towards a principal role in maintaining software portability of …
Efficient Development and Execution of OpenMP on GPUs
S Tian - 2023 - search.proquest.com
OpenMP has long been the preferred choice for CPU parallelism in High-Performance
Computing (HPC) applications written in C/C++/Fortran. With the increasing prevalence of …