Quantum-centric supercomputing for materials science: A perspective on challenges and future directions

Y Alexeev, M Amsler, MA Barroca, S Bassini… - Future Generation …, 2024 - Elsevier
Computational models are an essential tool for the design, characterization, and discovery
of novel materials. Computationally hard tasks in materials science stretch the limits of …

MIMD Programs Execution Support on SIMD Machines: A Holistic Survey

D Mustafa, R Alkhasawneh, F Obeidat… - IEEE Access, 2024 - ieeexplore.ieee.org
The Single Instruction Multiple Data (SIMD) architecture, supported by various high-
performance computing platforms, efficiently utilizes data-level parallelism. The SIMD model …

A performance analysis of modern parallel programming models using a compute-bound application

A Poenaru, WC Lin, S McIntosh-Smith - International Conference on High …, 2021 - Springer
Performance portability is becoming more-and-more important as next-generation high
performance computing systems grow increasingly diverse and heterogeneous. Several …

oneAPI open-source math library interface

M Krainiuk, M Goli, VR Pascuzzi - 2021 International Workshop …, 2021 - ieeexplore.ieee.org
To HPC and AI analytics engineers, math primitives such as basic linear algebra
subprograms or random number generators are key functionality that have highly optimized …

Under the hood of SYCL – an initial performance analysis with an unstructured-mesh CFD application

IZ Reguly, AMB Owenson, A Powell, SA Jarvis… - … Conference, ISC High …, 2021 - Springer
As the computing hardware landscape gets more diverse, and the complexity of hardware
grows, the need for a general purpose parallel programming model capable of developing …

AXI4MLIR: User-Driven Automatic Host Code Generation for Custom AXI-Based Accelerators

NB Agostini, J Haris, P Gibson… - 2024 IEEE/ACM …, 2024 - ieeexplore.ieee.org
This paper addresses the need for automatic and efficient generation of host driver code for
arbitrary custom AXI-based accelerators targeting linear algebra algorithms, an important …

Performance portable Vlasov code with C++ parallel algorithm

Y Asahi, T Padioleau, G Latu, J Bigot… - 2022 IEEE/ACM …, 2022 - ieeexplore.ieee.org
This paper presents the performance portable implementation of a kinetic plasma simulation
code with C++ parallel algorithm to run across multiple CPUs and GPUs. Relying on the …

Sylkan: towards a Vulkan compute target platform for SYCL

P Thoman, D Gogl, T Fahringer - … of the 9th International Workshop on …, 2021 - dl.acm.org
SYCL is a modern high-level C++ programming interface which excels at expressing data
parallelism for heterogeneous hardware platforms in a programmer-friendly way, and is …

The Celerity high-level API: C++20 for accelerator clusters

P Thoman, F Tischler, P Salzmann… - International Journal of …, 2022 - Springer
Providing convenient APIs and notations for data parallelism which remain accessible for
programmers while still providing good performance has been a long-term goal of …

First SYCL implementation of the three-dimensional subsurface XCA-Flow cellular automaton and performance comparison against CUDA

D D'Ambrosio, G Terremoto, A De Rango… - Proceedings of the …, 2022 - computing-conf.org
We present the results of a first SYCL vs CUDA performance assessment for the case of the
three-dimensional XCA-Flow subsurface Extended Cellular Automata model. A grid domain …