Demystifying parallel and distributed deep learning: An in-depth concurrency analysis
Deep Neural Networks (DNNs) are becoming an important tool in modern computing
applications. Accelerating their training is a major challenge and techniques range from …
Demystifying graph databases: Analysis and taxonomy of data organization, system designs, and graph queries
Numerous irregular graph datasets, for example social networks or web graphs, may contain
even trillions of edges. Often, their structure changes over time and they have domain …
The tensor algebra compiler
Tensor algebra is a powerful tool with applications in machine learning, data analytics,
engineering and the physical sciences. Tensors are often sparse and compound operations …
Parallel and distributed graph neural networks: An in-depth concurrency analysis
Graph neural networks (GNNs) are among the most powerful tools in deep learning. They
routinely solve complex problems on unstructured networks, such as node classification …
Compiler support for sparse tensor computations in MLIR
Sparse tensors arise in problems in science, engineering, machine learning, and data
analytics. Programs that operate on such tensors can exploit sparsity to reduce storage …
Communication-efficient Jaccard similarity for high-performance distributed genome comparisons
The Jaccard similarity index is an important measure of the overlap of two sets, widely used
in machine learning, computational genomics, information retrieval, and many other areas …
Scaling betweenness centrality using communication-efficient sparse matrix multiplication
Betweenness centrality (BC) is a crucial graph problem that measures the significance of a
vertex by the number of shortest paths leading through it. We propose Maximal Frontier …
Compilation of sparse array programming models
This paper shows how to compile sparse array programming languages. A sparse array
programming language is an array programming language that supports element-wise …
Mosaic: An interoperable compiler for tensor algebra
We introduce Mosaic, a sparse tensor algebra compiler that can bind tensor expressions to
external functions of other tensor algebra libraries and compilers. Users can extend Mosaic …
Combinatorial BLAS 2.0: Scaling combinatorial algorithms on distributed-memory systems
Combinatorial algorithms such as those that arise in graph analysis, modeling of discrete
systems, bioinformatics, and chemistry, are often hard to parallelize. The Combinatorial …