Demystifying parallel and distributed deep learning: An in-depth concurrency analysis

T Ben-Nun, T Hoefler - ACM Computing Surveys (CSUR), 2019 - dl.acm.org
Deep Neural Networks (DNNs) are becoming an important tool in modern computing
applications. Accelerating their training is a major challenge and techniques range from …

Demystifying graph databases: Analysis and taxonomy of data organization, system designs, and graph queries

M Besta, R Gerstenberger, E Peter, M Fischer… - ACM Computing …, 2023 - dl.acm.org
Numerous irregular graph datasets, for example social networks or web graphs, may contain
even trillions of edges. Often, their structure changes over time and they have domain …

The tensor algebra compiler

F Kjolstad, S Kamil, S Chou, D Lugato… - Proceedings of the …, 2017 - dl.acm.org
Tensor algebra is a powerful tool with applications in machine learning, data analytics,
engineering and the physical sciences. Tensors are often sparse and compound operations …

Parallel and distributed graph neural networks: An in-depth concurrency analysis

M Besta, T Hoefler - IEEE Transactions on Pattern Analysis and …, 2024 - ieeexplore.ieee.org
Graph neural networks (GNNs) are among the most powerful tools in deep learning. They
routinely solve complex problems on unstructured networks, such as node classification …

Compiler support for sparse tensor computations in MLIR

A Bik, P Koanantakool, T Shpeisman… - ACM Transactions on …, 2022 - dl.acm.org
Sparse tensors arise in problems in science, engineering, machine learning, and data
analytics. Programs that operate on such tensors can exploit sparsity to reduce storage …

Communication-efficient Jaccard similarity for high-performance distributed genome comparisons

M Besta, R Kanakagiri, H Mustafa… - 2020 IEEE …, 2020 - ieeexplore.ieee.org
The Jaccard similarity index is an important measure of the overlap of two sets, widely used
in machine learning, computational genomics, information retrieval, and many other areas …

Scaling betweenness centrality using communication-efficient sparse matrix multiplication

E Solomonik, M Besta, F Vella, T Hoefler - Proceedings of the …, 2017 - dl.acm.org
Betweenness centrality (BC) is a crucial graph problem that measures the significance of a
vertex by the number of shortest paths leading through it. We propose Maximal Frontier …

Compilation of sparse array programming models

R Henry, O Hsu, R Yadav, S Chou, K Olukotun… - Proceedings of the …, 2021 - dl.acm.org
This paper shows how to compile sparse array programming languages. A sparse array
programming language is an array programming language that supports element-wise …

Mosaic: An interoperable compiler for tensor algebra

M Bansal, O Hsu, K Olukotun, F Kjolstad - Proceedings of the ACM on …, 2023 - dl.acm.org
We introduce Mosaic, a sparse tensor algebra compiler that can bind tensor expressions to
external functions of other tensor algebra libraries and compilers. Users can extend Mosaic …

Combinatorial BLAS 2.0: Scaling combinatorial algorithms on distributed-memory systems

A Azad, O Selvitopi, MT Hussain… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
Combinatorial algorithms such as those that arise in graph analysis, modeling of discrete
systems, bioinformatics, and chemistry, are often hard to parallelize. The Combinatorial …