The building blocks of a brain-inspired computer

JD Kendall, S Kumar - Applied Physics Reviews, 2020 - pubs.aip.org
Computers have undergone tremendous improvements in performance over the last 60
years, but those improvements have significantly slowed down over the last decade, owing …

Scalable community detection with the Louvain algorithm

X Que, F Checconi, F Petrini… - 2015 IEEE international …, 2015 - ieeexplore.ieee.org
In this paper we present and evaluate a parallel community detection algorithm derived from
the state-of-the-art Louvain modularity maximization method. Our algorithm adopts a novel …

Scalable single source shortest path algorithms for massively parallel systems

VT Chakaravarthy, F Checconi, P Murali… - … on Parallel and …, 2016 - ieeexplore.ieee.org
We consider the single-source shortest path (SSSP) problem: given an undirected graph
with integer edge weights and a source vertex, find the shortest paths from the source to all other …

High-performance and scalable GPU graph traversal

D Merrill, M Garland, A Grimshaw - ACM Transactions on Parallel …, 2015 - dl.acm.org
Breadth-First Search (BFS) is a core primitive for graph traversal and a basis for many
higher-level graph analysis algorithms. It is also representative of a class of parallel computations …

Shentu: processing multi-trillion edge graphs on millions of cores in seconds

H Lin, X Zhu, B Yu, X Tang, W Xue… - … Conference for High …, 2018 - ieeexplore.ieee.org
Graphs are an important abstraction used in many scientific fields. With the magnitude of
graph-structured data constantly increasing, effective data analytics requires efficient and …

Multi-GPU graph analytics

Y Pan, Y Wang, Y Wu, C Yang… - 2017 IEEE International …, 2017 - ieeexplore.ieee.org
We present a single-node, multi-GPU programmable graph processing library that allows
programmers to easily extend single-GPU graph algorithms to achieve scalable …

Partitioning trillion-edge graphs in minutes

GM Slota, S Rajamanickam, K Devine… - 2017 IEEE …, 2017 - ieeexplore.ieee.org
We introduce XtraPuLP, a new distributed-memory graph partitioner designed to process
trillion-edge graphs. XtraPuLP is based on the scalable label propagation community …

G-store: high-performance graph store for trillion-edge processing

P Kumar, HH Huang - SC'16: Proceedings of the International …, 2016 - ieeexplore.ieee.org
High-performance graph processing brings great benefits to a wide range of scientific
applications, e.g., biology networks, recommendation systems, and social networks, where …

Scaling graph traversal to 281 trillion edges with 40 million cores

H Cao, Y Wang, H Wang, H Lin, Z Ma, W Yin… - Proceedings of the 27th …, 2022 - dl.acm.org
Graph processing, especially high-performance graph traversal, plays a more and more
important role in data analytics. The successor of Sunway TaihuLight, New Sunway, is …

Slim graph: Practical lossy graph compression for approximate graph processing, storage, and analytics

M Besta, S Weber, L Gianinazzi… - Proceedings of the …, 2019 - dl.acm.org
We propose Slim Graph: the first programming model and framework for practical lossy
graph compression that facilitates high-performance approximate graph processing …