Hybrid CUDA, OpenMP, and MPI parallel programming on multicore GPU clusters

CT Yang, CL Huang, CF Lin - Computer Physics Communications, 2011 - Elsevier
Nowadays, NVIDIA's CUDA is a general purpose scalable parallel programming model for
writing highly parallel applications. It provides several key abstractions–a hierarchy of thread …

Accelerating incompressible flow computations with a Pthreads-CUDA implementation on small-footprint multi-GPU platforms

JC Thibault, I Senocak - The Journal of Supercomputing, 2012 - Springer
Graphics processor units (GPU) that are originally designed for graphics rendering have
emerged as massively-parallel “co-processors” to the central processing unit (CPU). Small …

Virtualization of reconfigurable coprocessors in HPRC systems with multicore architecture

I Gonzalez, S Lopez-Buedo, G Sutter… - Journal of Systems …, 2012 - Elsevier
HPRC (High-Performance Reconfigurable Computing) systems include multicore
processors and reconfigurable devices acting as custom coprocessors. Due to economic …

A parallel hybrid implementation of the 2D acoustic wave equation

A Altybay, M Ruzhansky… - International Journal of …, 2020 - degruyter.com
In this paper, we propose a hybrid parallel programming approach for a numerical solution
of a two-dimensional acoustic wave equation using an implicit difference scheme for a …

Hybrid parallel programming on GPU clusters

CT Yang, CL Huang, CF Lin… - … Symposium on Parallel …, 2010 - ieeexplore.ieee.org
Nowadays, NVIDIA's CUDA is a general purpose scalable parallel programming model for
writing highly parallel applications. It provides several key abstractions–a hierarchy of thread …

Multi-GPU and multi-CPU accelerated FDTD scheme for vibroacoustic applications

J Francés, B Otero, S Bleda, S Gallego, C Neipp… - Computer Physics …, 2015 - Elsevier
The Finite-Difference Time-Domain (FDTD) method is applied to the analysis of
vibroacoustic problems and to study the propagation of longitudinal and transversal waves …

Detecting point sources in CMB maps using an efficient parallel algorithm

P Alonso, F Argüeso, R Cortina, J Ranilla… - Journal of Mathematical …, 2012 - Springer
The Cosmic Microwave Background (CMB) is a diffuse radiation which is
contaminated by the radiation emitted by point sources. The precise knowledge of CMB …

Significance of Parallel Computation over Serial Computation Using OpenMP, MPI, and CUDA

S Rastogi, H Zaheer - Quality, IT and Business Operations: Modeling and …, 2018 - Springer
The need of fast computers to perform multiple works simultaneously in less time is
increasing day by day. In serial computation, tasks are performed one by one which takes …

Two-stage distributed parallel algorithm with message passing interface for maximum flow problem

J Jiang, L Wu - The Journal of Supercomputing, 2015 - Springer
Maximum flow is one of the important and classical combinatorial optimization problems.
However, the time complexity of sequential maximum flow algorithms remains high. In this …

OpenMP based Action Entropy Active Sensing in Cloud Computing

Y Liu - 2020 - rave.ohiolink.edu
Action entropy active sensing is a newly proposed task-oriented active sensing method that
selects the sensing action by minimizing the uncertainty in the future task. When the number …