Communication lower bounds and optimal algorithms for numerical linear algebra
The traditional metric for the efficiency of a numerical algorithm has been the number of
arithmetic operations it performs. Technological trends have long been reducing the time to …
Mesh-TensorFlow: Deep learning for supercomputers
Batch-splitting (data-parallelism) is the dominant distributed Deep Neural Network
(DNN) training strategy, due to its universal applicability and its amenability to Single …
A systematic survey of general sparse matrix-matrix multiplication
General Sparse Matrix-Matrix Multiplication (SpGEMM) has attracted much attention from
researchers in graph analyzing, scientific computing, and deep learning. Many optimization …
Mathematical foundations of the GraphBLAS
The GraphBLAS standard (GraphBlas.org) is being developed to bring the potential of
matrix-based graph algorithms to the broadest possible audience. Mathematically, the …
Sparse matrix multiplication: The distributed block-compressed sparse row library
U Borštnik, J VandeVondele, V Weber, J Hutter - Parallel Computing, 2014 - Elsevier
Efficient parallel multiplication of sparse matrices is key to enabling many large-scale
calculations. This article presents the DBCSR (Distributed Block Compressed Sparse Row) …
TileSpGEMM: A tiled algorithm for parallel sparse general matrix-matrix multiplication on GPUs
Sparse general matrix-matrix multiplication (SpGEMM) is one of the most fundamental
building blocks in sparse linear solvers, graph processing frameworks and machine learning …
Parallel triangle counting and enumeration using matrix algebra
Triangle counting and enumeration are important kernels that are used to characterize
graphs. They are also used to compute important statistics such as clustering coefficients …
An efficient GPU general sparse matrix-matrix multiplication for irregular data
W Liu, B Vinter - 2014 IEEE 28th international parallel and …, 2014 - ieeexplore.ieee.org
General sparse matrix-matrix multiplication (SpGEMM) is a fundamental building block for
numerous applications such as algebraic multigrid method, breadth first search and shortest …
Communication-efficient Jaccard similarity for high-performance distributed genome comparisons
The Jaccard similarity index is an important measure of the overlap of two sets, widely used
in machine learning, computational genomics, information retrieval, and many other areas …
Exploiting multiple levels of parallelism in sparse matrix-matrix multiplication
Sparse matrix-matrix multiplication (or SpGEMM) is a key primitive for many high-
performance graph algorithms as well as for some linear solvers, such as algebraic …