GE-SpMM: General-purpose sparse matrix-matrix multiplication on GPUs for graph neural networks
The acceleration of Graph Neural Networks (GNNs) requires efficient and framework-
compatible Sparse-Dense Matrix-Matrix Multiplication (SpMM). From the compatibility …
Adaptive sparse tiling for sparse matrix multiplication
Tiling is a key technique for data locality optimization and is widely used in high-
performance implementations of dense matrix-matrix multiplication for multicore/manycore …
Coarsening the granularity: Towards structurally sparse lottery tickets
The lottery ticket hypothesis (LTH) has shown that dense models contain highly sparse
subnetworks (i.e., winning tickets) that can be trained in isolation to match full accuracy …
SparseP: Towards efficient sparse matrix vector multiplication on real processing-in-memory architectures
Several manufacturers have already started to commercialize near-bank Processing-In-
Memory (PIM) architectures, after decades of research efforts. Near-bank PIM architectures …
TileSpGEMM: A tiled algorithm for parallel sparse general matrix-matrix multiplication on GPUs
Sparse general matrix-matrix multiplication (SpGEMM) is one of the most fundamental
building blocks in sparse linear solvers, graph processing frameworks and machine learning …
Towards efficient sparse matrix vector multiplication on real processing-in-memory architectures
Several manufacturers have already started to commercialize near-bank Processing-In-
Memory (PIM) architectures, after decades of research efforts. Near-bank PIM architectures …
SMASH: Co-designing software compression and hardware-accelerated indexing for efficient sparse matrix operations
Important workloads, such as machine learning and graph analytics applications, heavily
involve sparse linear algebra operations. These operations use sparse matrix compression …
Design principles for sparse matrix multiplication on the GPU
We implement two novel algorithms for sparse-matrix dense-matrix multiplication (SpMM) on
the GPU. Our algorithms expect the sparse input in the popular compressed-sparse-row …
SparseRT: Accelerating unstructured sparsity on GPUs for deep learning inference
Z Wang - Proceedings of the ACM international conference on …, 2020 - dl.acm.org
In recent years, there has been a flurry of research in deep neural network pruning and
compression. Early approaches prune weights individually. However, it is difficult to take …
Accel-GCN: High-performance GPU accelerator design for graph convolution networks
Graph Convolutional Networks (GCNs) are pivotal in extracting latent information from graph
data across various domains, yet their acceleration on mainstream GPUs is challenged by …