CSX: an extended compression format for SpMV on shared memory systems

K Kourtis, V Karakasis, G Goumas, N Koziris - ACM SIGPLAN Notices, 2011 - dl.acm.org
The Sparse Matrix-Vector multiplication (SpMV) kernel scales poorly on shared memory
systems with multiple processing units due to the streaming nature of its data access pattern …

Optimizing sparse matrix-vector multiplication using index and value compression

K Kourtis, G Goumas, N Koziris - Proceedings of the 5th conference on …, 2008 - dl.acm.org
Previous research work has identified memory bandwidth as the main bottleneck of the
ubiquitous Sparse Matrix-Vector Multiplication kernel. To attack this problem, we aim at …

Performance evaluation of the sparse matrix-vector multiplication on modern architectures

G Goumas, K Kourtis, N Anastopoulos… - The Journal of …, 2009 - Springer
In this paper, we revisit the performance issues of the widely used sparse matrix-vector
multiplication (SpMxV) kernel on modern microarchitectures. Previous scientific work reports …

Fast conjugate gradients with multiple GPUs

A Cevahir, A Nukada, S Matsuoka - … Baton Rouge, LA, USA, May 25-27 …, 2009 - Springer
The limiting factor for efficiency of sparse linear solvers is the memory bandwidth. In this
work, we describe a fast Conjugate Gradient solver for unstructured problems, which runs on …

Understanding the performance of sparse matrix-vector multiplication

G Goumas, K Kourtis, N Anastopoulos… - … and Network-Based …, 2008 - ieeexplore.ieee.org
In this paper we revisit the performance issues of the widely used sparse matrix-vector
multiplication (SpMxV) kernel on modern microarchitectures. Previous scientific work reports …

Performance analysis and optimization of sparse matrix-vector multiplication on modern multi- and many-core processors

A Elafrou, G Goumas, N Koziris - 2017 46th International …, 2017 - ieeexplore.ieee.org
This paper presents a low-overhead optimizer for the ubiquitous sparse matrix-vector
multiplication (SpMV) kernel. Architectural diversity among different processors together with …

Learning sparse matrix row permutations for efficient SpMM on GPU architectures

A Mehrabi, D Lee, N Chatterjee… - … Analysis of Systems …, 2021 - ieeexplore.ieee.org
Achieving peak performance on sparse operations is challenging. The distribution of the
non-zero elements and underlying hardware platform affect the execution efficiency. Given the …

Bringing Order to Sparsity: A Sparse Matrix Reordering Study on Multicore CPUs

JD Trotter, S Ekmekçibaşı, J Langguth, T Torun… - Proceedings of the …, 2023 - dl.acm.org
Many real-world computations involve sparse data structures in the form of sparse matrices.
A common strategy for optimizing sparse matrix operations is to reorder a matrix to improve …

Structured matrices and their application in neural networks: A survey

M Kissel, K Diepold - New Generation Computing, 2023 - Springer
Modern neural network architectures are becoming larger and deeper, with increasing
computational resources needed for training and inference. One approach toward handling …

Improving the performance of multithreaded sparse matrix-vector multiplication using index and value compression

K Kourtis, G Goumas, N Koziris - 2008 37th International …, 2008 - ieeexplore.ieee.org
The sparse matrix-vector multiplication kernel exhibits limited potential for taking advantage
of modern shared memory architectures due to its large memory bandwidth requirements …