Optimization techniques for GPU programming

P Hijma, S Heldens, A Sclocco… - ACM Computing …, 2023 - dl.acm.org
In the past decade, Graphics Processing Units have played an important role in the field of
high-performance computing and they still advance new fields such as IoT, autonomous …

Graph processing on GPUs: A survey

X Shi, Z Zheng, Y Zhou, H Jin, L He, B Liu… - ACM Computing Surveys …, 2018 - dl.acm.org
In the big data era, much real-world data can be naturally represented as graphs.
Consequently, many application domains can be modeled as graph processing. Graph …

A scalable processing-in-memory accelerator for parallel graph processing

J Ahn, S Hong, S Yoo, O Mutlu, K Choi - Proceedings of the 42nd Annual …, 2015 - dl.acm.org
The explosion of digital data and the ever-growing need for fast data analysis have made
in-memory big-data processing in computer systems increasingly important. In particular, large …

GCNAX: A flexible and energy-efficient accelerator for graph convolutional neural networks

J Li, A Louri, A Karanth… - 2021 IEEE International …, 2021 - ieeexplore.ieee.org
Graph convolutional neural networks (GCNs) have emerged as an effective approach to
extend deep learning for graph data analytics. Given that graphs are usually irregular, as …

Scalable GPU graph traversal

D Merrill, M Garland, A Grimshaw - ACM SIGPLAN Notices, 2012 - dl.acm.org
Breadth-first search (BFS) is a core primitive for graph traversal and a basis for many
higher-level graph analysis algorithms. It is also representative of a class of parallel computations …

A quantitative study of irregular programs on GPUs

M Burtscher, R Nasre, K Pingali - 2012 IEEE International …, 2012 - ieeexplore.ieee.org
GPUs have been used to accelerate many regular applications and, more recently, irregular
applications in which the control flow and memory access patterns are data-dependent and …

CuSha: vertex-centric graph processing on GPUs

F Khorasani, K Vora, R Gupta, LN Bhuyan - Proceedings of the 23rd …, 2014 - dl.acm.org
Vertex-centric graph processing is employed by many popular algorithms (e.g., PageRank)
due to its simplicity and efficient use of asynchronous parallelism. The high compute power …

Green-Marl: a DSL for easy and efficient graph analysis

S Hong, H Chafi, E Sedlar, K Olukotun - Proceedings of the seventeenth …, 2012 - dl.acm.org
The increasing importance of graph-data based applications is fueling the need for highly
efficient and parallel implementations of graph analysis software. In this paper we describe …

Energy efficient architecture for graph analytics accelerators

MM Ozdal, S Yesil, T Kim, A Ayupov, J Greth… - ACM SIGARCH …, 2016 - dl.acm.org
Specialized hardware accelerators can significantly improve the performance and power
efficiency of compute systems. In this paper, we focus on hardware accelerators for graph …

Medusa: Simplified graph processing on GPUs

J Zhong, B He - IEEE Transactions on Parallel and Distributed …, 2013 - ieeexplore.ieee.org
Graphs are common data structures for many applications, and efficient graph processing is
a must for application performance. Recently, the graphics processing unit (GPU) has been …