Hybrid CUDA, OpenMP, and MPI parallel programming on multicore GPU clusters
CT Yang, CL Huang, CF Lin - Computer Physics Communications, 2011 - Elsevier
Nowadays, NVIDIA's CUDA is a general purpose scalable parallel programming model for
writing highly parallel applications. It provides several key abstractions–a hierarchy of thread …
Accelerating incompressible flow computations with a Pthreads-CUDA implementation on small-footprint multi-GPU platforms
JC Thibault, I Senocak - The Journal of Supercomputing, 2012 - Springer
Graphics processor units (GPU) that are originally designed for graphics rendering have
emerged as massively-parallel “co-processors” to the central processing unit (CPU). Small …
Virtualization of reconfigurable coprocessors in HPRC systems with multicore architecture
HPRC (High-Performance Reconfigurable Computing) systems include multicore
processors and reconfigurable devices acting as custom coprocessors. Due to economic …
A parallel hybrid implementation of the 2D acoustic wave equation
A Altybay, M Ruzhansky… - International Journal of …, 2020 - degruyter.com
In this paper, we propose a hybrid parallel programming approach for a numerical solution
of a two-dimensional acoustic wave equation using an implicit difference scheme for a …
Hybrid parallel programming on GPU clusters
CT Yang, CL Huang, CF Lin… - … Symposium on Parallel …, 2010 - ieeexplore.ieee.org
Nowadays, NVIDIA's CUDA is a general purpose scalable parallel programming model for
writing highly parallel applications. It provides several key abstractions-a hierarchy of thread …
Multi-GPU and multi-CPU accelerated FDTD scheme for vibroacoustic applications
The Finite-Difference Time-Domain (FDTD) method is applied to the analysis of
vibroacoustic problems and to study the propagation of longitudinal and transversal waves …
Detecting point sources in CMB maps using an efficient parallel algorithm
The Cosmic Microwave Background (CMB) is a diffuse radiation which is
contaminated by the radiation emitted by point sources. The precise knowledge of CMB …
Significance of Parallel Computation over Serial Computation Using OpenMP, MPI, and CUDA
The need of fast computers to perform multiple works simultaneously in less time is
increasing day by day. In serial computation, tasks are performed one by one which takes …
Two-stage distributed parallel algorithm with message passing interface for maximum flow problem
J Jiang, L Wu - The Journal of Supercomputing, 2015 - Springer
Maximum flow is one of the important and classical combinatorial optimization problems.
However, the time complexity of sequential maximum flow algorithms remains high. In this …
OpenMP based Action Entropy Active Sensing in Cloud Computing
Y Liu - 2020 - rave.ohiolink.edu
Action entropy active sensing is a newly proposed task-oriented active sensing method that
selects the sensing action by minimizing the uncertainty in the future task. When the number …