Scalable Tuning of (OpenMP) GPU Applications via Kernel Record and Replay

K Parasyris, G Georgakoudis, E Rangel… - Proceedings of the …, 2023 - dl.acm.org
HPC is a heterogeneous world in which host and device code are interleaved throughout
the application. Given the significant performance advantage of accelerators, device code …

Thermo4PFM: Facilitating phase-field simulations of alloys with thermodynamic driving forces

JL Fattebert, S DeWitt, A Perron, J Turner - Computer Physics …, 2023 - Elsevier
Phase-field modeling is a popular front-tracking approach used to model solidification. Its
time-evolution equations are often coupled to alloy composition and/or thermal diffusion in …

Direct GPU compilation and execution for host applications with OpenMP Parallelism

S Tian, J Huber, K Parasyris… - 2022 IEEE/ACM …, 2022 - ieeexplore.ieee.org
Currently, offloading to accelerators requires users to identify which regions are to be
executed on the device, what memory needs to be transferred, and how synchronization is …

Experiences building an MLIR-based SYCL compiler

E Tiotto, V Pérez, W Tsang, L Sommer… - 2024 IEEE/ACM …, 2024 - ieeexplore.ieee.org
Similar to other programming models, compilers for SYCL, the open programming model for
heterogeneous computing based on C++, would benefit from access to higher-level …

GPU First -- Execution of Legacy CPU Codes on GPUs

S Tian, T Scogland, B Chapman, J Doerfert - arXiv preprint arXiv …, 2023 - arxiv.org
Utilizing GPUs is critical for high performance on heterogeneous systems. However,
leveraging the full potential of GPUs for accelerating legacy CPU applications can be a …

Advancing the state of the art of directive-based programming for GPUs: runtime and compilation

K Matsumura - 2024 - upcommons.upc.edu
The rapid development in computing technology has paved the way for directive-based
programming models towards a principal role in maintaining software portability of …

Efficient Development and Execution of OpenMP on GPUs

S Tian - 2023 - search.proquest.com
OpenMP has long been the preferred choice for CPU parallelism in High-Performance
Computing (HPC) applications written in C/C++/Fortran. With the increasing prevalence of …