GE-SpMM: General-purpose sparse matrix-matrix multiplication on GPUs for graph neural networks

G Huang, G Dai, Y Wang, H Yang - … Conference for High …, 2020 - ieeexplore.ieee.org
The acceleration of Graph Neural Networks (GNNs) requires efficient and framework-
compatible Sparse-Dense Matrix-Matrix Multiplication (SpMM). From the compatibility …

Adaptive sparse tiling for sparse matrix multiplication

C Hong, A Sukumaran-Rajam, I Nisa, K Singh… - Proceedings of the 24th …, 2019 - dl.acm.org
Tiling is a key technique for data locality optimization and is widely used in high-
performance implementations of dense matrix-matrix multiplication for multicore/manycore …

Coarsening the granularity: Towards structurally sparse lottery tickets

T Chen, X Chen, X Ma, Y Wang… - … conference on machine …, 2022 - proceedings.mlr.press
The lottery ticket hypothesis (LTH) has shown that dense models contain highly sparse
subnetworks (i.e., winning tickets) that can be trained in isolation to match full accuracy …

SparseP: Towards efficient sparse matrix vector multiplication on real processing-in-memory architectures

C Giannoula, I Fernandez, JG Luna, N Koziris… - Proceedings of the …, 2022 - dl.acm.org
Several manufacturers have already started to commercialize near-bank Processing-In-
Memory (PIM) architectures, after decades of research efforts. Near-bank PIM architectures …

TileSpGEMM: A tiled algorithm for parallel sparse general matrix-matrix multiplication on GPUs

Y Niu, Z Lu, H Ji, S Song, Z Jin, W Liu - Proceedings of the 27th ACM …, 2022 - dl.acm.org
Sparse general matrix-matrix multiplication (SpGEMM) is one of the most fundamental
building blocks in sparse linear solvers, graph processing frameworks and machine learning …

Towards efficient sparse matrix vector multiplication on real processing-in-memory architectures

C Giannoula, I Fernandez, J Gómez-Luna… - ACM SIGMETRICS …, 2022 - dl.acm.org
Several manufacturers have already started to commercialize near-bank Processing-In-
Memory (PIM) architectures, after decades of research efforts. Near-bank PIM architectures …

SMASH: Co-designing software compression and hardware-accelerated indexing for efficient sparse matrix operations

K Kanellopoulos, N Vijaykumar, C Giannoula… - Proceedings of the …, 2019 - dl.acm.org
Important workloads, such as machine learning and graph analytics applications, heavily
involve sparse linear algebra operations. These operations use sparse matrix compression …

Design principles for sparse matrix multiplication on the GPU

C Yang, A Buluç, JD Owens - European Conference on Parallel …, 2018 - Springer
We implement two novel algorithms for sparse-matrix dense-matrix multiplication (SpMM) on
the GPU. Our algorithms expect the sparse input in the popular compressed-sparse-row …

SparseRT: Accelerating unstructured sparsity on GPUs for deep learning inference

Z Wang - Proceedings of the ACM international conference on …, 2020 - dl.acm.org
In recent years, there has been a flurry of research in deep neural network pruning and
compression. Early approaches prune weights individually. However, it is difficult to take …

Accel-GCN: High-performance GPU accelerator design for graph convolution networks

X Xie, H Peng, A Hasan, S Huang… - 2023 IEEE/ACM …, 2023 - ieeexplore.ieee.org
Graph Convolutional Networks (GCNs) are pivotal in extracting latent information from graph
data across various domains, yet their acceleration on mainstream GPUs is challenged by …