A survey on deep learning hardware accelerators for heterogeneous HPC platforms
Recent trends in deep learning (DL) imposed hardware accelerators as the most viable
solution for several classes of high-performance computing (HPC) applications such as …
TileFlow: A framework for modeling fusion dataflow via tree-based analysis
With the increasing size of DNN models and the growing discrepancy between compute
performance and memory bandwidth, fusing multiple layers together to reduce off-chip …
TeAAL: A declarative framework for modeling sparse tensor accelerators
Over the past few years, the explosion in sparse tensor algebra workloads has led to a
corresponding rise in domain-specific accelerators to service them. Due to the irregularity …
SPADE: A flexible and scalable accelerator for SpMM and SDDMM
The widespread use of Sparse Matrix Dense Matrix Multiplication (SpMM) and Sampled
Dense Matrix Dense Matrix Multiplication (SDDMM) kernels makes them candidates for …
Sparse-DySta: Sparsity-Aware Dynamic and Static Scheduling for Sparse Multi-DNN Workloads
Running multiple deep neural networks (DNNs) in parallel has become an emerging
workload in both edge devices, such as mobile phones where multiple tasks serve a single …
MuchiSim: A simulation framework for design exploration of multi-chip manycore systems
The design space exploration of scaled-out manycores for communication-intensive
applications (e.g., graph analytics and sparse linear algebra) is hampered due to either lack …
Spatula: A hardware accelerator for sparse matrix factorization
A Feldmann, D Sanchez - Proceedings of the 56th Annual IEEE/ACM …, 2023 - dl.acm.org
Solving sparse systems of linear equations is a crucial component in many science and
engineering problems, like simulating physical systems. Sparse matrix factorization …
FlatDD: A High-Performance Quantum Circuit Simulator using Decision Diagram and Flat Array
Quantum circuit simulator (QCS) is essential for designing quantum algorithms because it
assists researchers in understanding how quantum operations work without access to …
The EDGE language: Extended general einsums for graph algorithms
In this work, we propose a unified abstraction for graph algorithms: the Extended General
Einsums language, or EDGE. The EDGE language expresses graph algorithms in the …
Dedicated hardware accelerators for processing of sparse matrices and vectors: a survey
V Isaac-Chassande, A Evans, Y Durand… - ACM Transactions on …, 2024 - dl.acm.org
Performance in scientific and engineering applications such as computational physics,
algebraic graph problems or Convolutional Neural Networks (CNN), is dominated by the …