A survey of CPU-GPU heterogeneous computing techniques
As both CPUs and GPUs become employed in a wide range of applications, it has been
acknowledged that both of these Processing Units (PUs) have their unique features and …
Embracing diversity in the Barrelfish manycore operating system
We discuss diversity and heterogeneity in manycore computer systems, and identify three
distinct types of diversity, all of which present challenges to operating system designers and …
An efficient, model-based CPU-GPU heterogeneous FFT library
Y Ogata, T Endo, N Maruyama… - 2008 IEEE international …, 2008 - ieeexplore.ieee.org
General-Purpose computing on Graphics Processing Units (GPGPU) is becoming popular in
HPC because of its high peak performance. However, in spite of the potential performance …
Energy-efficient acceleration of deep neural networks on realtime-constrained embedded edge devices
This paper presents a hardware management technique that enables energy-efficient
acceleration of deep neural networks (DNNs) on realtime-constrained embedded edge …
Optimization of sparse matrix-vector multiplication with variant CSR on GPUs
X Feng, H Jin, R Zheng, K Hu, J Zeng… - 2011 IEEE 17th …, 2011 - ieeexplore.ieee.org
Sparse Matrix-Vector multiplication (SpMV) is one of the most significant yet challenging
issues in computational science area. It is a memory-bound application whose performance …
Processing data streams with hard real-time constraints on heterogeneous systems
U Verner, A Schuster, M Silberstein - Proceedings of the international …, 2011 - dl.acm.org
Data stream processing applications such as stock exchange data analysis, VoIP streaming,
and sensor data processing pose two conflicting challenges: short per-stream latency--to …
Matrix multiplication on high-density multi-GPU architectures: theoretical and experimental investigations
Matrix multiplication (MM) is one of the core problems in the high performance computing
domain and its efficiency impacts performances of almost all matrix problems. The high …
Optimization of quasi-diagonal matrix–vector multiplication on GPU
W Yang, K Li, Y Liu, L Shi… - The international journal …, 2014 - journals.sagepub.com
Sparse matrix–vector multiplication (SpMV) is of singular importance in sparse linear
algebra, which is an important issue in scientific computing and engineering practice. Much …
An efficient GPU implementation of the revised simplex method
J Bieling, P Peschlow, P Martini - 2010 IEEE International …, 2010 - ieeexplore.ieee.org
The computational power provided by the massive parallelism of modern graphics
processing units (GPUs) has moved increasingly into focus over the past few years. In …
Flexi-BOPI: Flexible Granularity Pipeline Inference with Bayesian Optimization for Deep Learning Models on HMPSoC
To achieve high-throughput deep learning (DL) model inference on heterogeneous
multiprocessor systems-on-chip (HMPSoC) platforms, the use of pipelining for the …