Parallel programming models for heterogeneous many-cores: a comprehensive survey

J Fang, C Huang, T Tang, Z Wang - CCF Transactions on High …, 2020 - Springer
Heterogeneous many-cores are now an integral part of modern computing systems ranging
from embedded systems to supercomputers. While heterogeneous many-core design offers …

Graviton: Trusted execution environments on GPUs

S Volos, K Vaswani, R Bruno - 13th USENIX Symposium on Operating …, 2018 - usenix.org
We propose Graviton, an architecture for supporting trusted execution environments on
GPUs. Graviton enables applications to offload security-and performance-sensitive kernels …

Multi2Sim: A simulation framework for CPU-GPU computing

R Ubal, B Jang, P Mistry, D Schaa, D Kaeli - Proceedings of the 21st …, 2012 - dl.acm.org
Accurate simulation is essential for the proper design and evaluation of any computing
platform. Upon the current move toward the CPU-GPU heterogeneous computing era …

Duality cache for data parallel acceleration

D Fujiki, S Mahlke, R Das - … of the 46th International Symposium on …, 2019 - dl.acm.org
Duality Cache is an in-cache computation architecture that enables general purpose data
parallel applications to run on caches. This paper presents a holistic approach of building …

Gdev: First-Class GPU Resource Management in the Operating System

S Kato, M McThrow, C Maltzahn, S Brandt - 2012 USENIX Annual …, 2012 - usenix.org
Graphics processing units (GPUs) have become a very powerful platform embracing a
concept of heterogeneous many-core computing. However, application domains of GPUs …

A performance analysis framework for identifying potential benefits in GPGPU applications

J Sim, A Dasgupta, H Kim, R Vuduc - Proceedings of the 17th ACM …, 2012 - dl.acm.org
Tuning code for GPGPU and other emerging many-core platforms is a challenge because
few models or tools can precisely pinpoint the root cause of performance bottlenecks. In this …

A survey on techniques for cooperative CPU-GPU computing

K Raju, NN Chiplunkar - Sustainable Computing: Informatics and Systems, 2018 - Elsevier
Graphical Processing Unit provides massive parallelism due to the presence of
hundreds of cores. Usage of GPUs for general purpose computation (GPGPU) has resulted …

GKLEE: concolic verification and test generation for GPUs

G Li, P Li, G Sawaya, G Gopalakrishnan… - Proceedings of the 17th …, 2012 - dl.acm.org
Programs written for GPUs often contain correctness errors such as races, deadlocks, or
may compute the wrong result. Existing debugging tools often miss these errors because of …

Transparent CPU-GPU collaboration for data-parallel kernels on heterogeneous systems

J Lee, M Samadi, Y Park… - Proceedings of the 22nd …, 2013 - ieeexplore.ieee.org
Heterogeneous computing on CPUs and GPUs has traditionally used fixed roles for each
device: the GPU handles data parallel work by taking advantage of its massive number of …

OCCA: A unified approach to multi-threading languages

DS Medina, A St-Cyr, T Warburton - arXiv preprint arXiv:1403.0968, 2014 - arxiv.org
The inability to predict lasting languages and architectures led us to develop OCCA, a C++
library focused on host-device interaction. Using run-time compilation and macro …