A survey on deep learning hardware accelerators for heterogeneous HPC platforms

C Silvano, D Ielmini, F Ferrandi, L Fiorin… - arXiv preprint arXiv …, 2023 - arxiv.org
Recent trends in deep learning (DL) imposed hardware accelerators as the most viable
solution for several classes of high-performance computing (HPC) applications such as …

TileFlow: A framework for modeling fusion dataflow via tree-based analysis

S Zheng, S Chen, S Gao, L Jia, G Sun… - Proceedings of the 56th …, 2023 - dl.acm.org
With the increasing size of DNN models and the growing discrepancy between compute
performance and memory bandwidth, fusing multiple layers together to reduce off-chip …

TeAAL: A declarative framework for modeling sparse tensor accelerators

N Nayak, TO Odemuyiwa, S Ugare, C Fletcher… - Proceedings of the 56th …, 2023 - dl.acm.org
Over the past few years, the explosion in sparse tensor algebra workloads has led to a
corresponding rise in domain-specific accelerators to service them. Due to the irregularity …

SPADE: A flexible and scalable accelerator for SpMM and SDDMM

G Gerogiannis, S Yesil, D Lenadora, D Cao… - Proceedings of the 50th …, 2023 - dl.acm.org
The widespread use of Sparse Matrix Dense Matrix Multiplication (SpMM) and Sampled
Dense Matrix Dense Matrix Multiplication (SDDMM) kernels makes them candidates for …

Sparse-DySta: Sparsity-Aware Dynamic and Static Scheduling for Sparse Multi-DNN Workloads

H Fan, SI Venieris, A Kouris, N Lane - … of the 56th Annual IEEE/ACM …, 2023 - dl.acm.org
Running multiple deep neural networks (DNNs) in parallel has become an emerging
workload in both edge devices, such as mobile phones where multiple tasks serve a single …

MuchiSim: A simulation framework for design exploration of multi-chip manycore systems

M Orenes-Vera, E Tureci, M Martonosi… - … Analysis of Systems …, 2024 - ieeexplore.ieee.org
The design space exploration of scaled-out manycores for communication-intensive
applications (e.g., graph analytics and sparse linear algebra) is hampered due to either lack …

Spatula: A hardware accelerator for sparse matrix factorization

A Feldmann, D Sanchez - Proceedings of the 56th Annual IEEE/ACM …, 2023 - dl.acm.org
Solving sparse systems of linear equations is a crucial component in many science and
engineering problems, like simulating physical systems. Sparse matrix factorization …

FlatDD: A High-Performance Quantum Circuit Simulator using Decision Diagram and Flat Array

S Jiang, R Fu, L Burgholzer, R Wille, TY Ho… - Proceedings of the 53rd …, 2024 - dl.acm.org
Quantum circuit simulator (QCS) is essential for designing quantum algorithms because it
assists researchers in understanding how quantum operations work without access to …

The EDGE language: Extended general einsums for graph algorithms

TO Odemuyiwa, JS Emer, JD Owens - arXiv preprint arXiv:2404.11591, 2024 - arxiv.org
In this work, we propose a unified abstraction for graph algorithms: the Extended General
Einsums language, or EDGE. The EDGE language expresses graph algorithms in the …

Dedicated hardware accelerators for processing of sparse matrices and vectors: a survey

V Isaac–Chassande, A Evans, Y Durand… - ACM Transactions on …, 2024 - dl.acm.org
Performance in scientific and engineering applications such as computational physics,
algebraic graph problems or Convolutional Neural Networks (CNN), is dominated by the …