Buffets: An efficient and composable storage idiom for explicit decoupled data orchestration

CE Leiserson, NC Thompson, JS Emer, BC Kuszmaul… - Science, 2020 - science.org

BACKGROUND Improvements in computing power can claim a large share of the credit for
many of the things that we take for granted in our modern lives: cellphones that are more …

Cited by 481 Related articles All 7 versions

[PDF] mit.edu

Timeloop: A systematic approach to DNN accelerator evaluation

A Parashar, P Raina, YS Shao, YH Chen… - … analysis of systems …, 2019 - ieeexplore.ieee.org

This paper presents Timeloop, an infrastructure for evaluating and exploring the architecture
design space of deep neural network (DNN) accelerators. Timeloop uses a concise and …

Cited by 467 Related articles All 13 versions

[PDF] mit.edu

[BOOK][B] Efficient processing of deep neural networks

V Sze, YH Chen, TJ Yang, JS Emer - 2020 - Springer

This book provides a structured treatment of the key principles and techniques for enabling
efficient processing of deep neural networks (DNNs). DNNs are currently widely used for …

Cited by 282 Related articles All 6 versions

[PDF] academia.edu

ExTensor: An accelerator for sparse tensor algebra

K Hegde, H Asghari-Moghaddam, M Pellauer… - Proceedings of the …, 2019 - dl.acm.org

Generalized tensor algebra is a prime candidate for acceleration via customized ASICs.
Modern tensors feature a wide range of data sparsity, with the density of non-zero elements …

Cited by 257 Related articles All 9 versions

[PDF] horizon-lab.org

A systematic methodology for characterizing scalability of DNN accelerators using SCALE-Sim

A Samajdar, JM Joseph, Y Zhu… - … Analysis of Systems …, 2020 - ieeexplore.ieee.org

The compute demand for deep learning workloads is well known and is a prime motivator for
powerful parallel computing platforms such as GPUs or dedicated hardware accelerators …

Cited by 205 Related articles All 5 versions

[PDF] mit.edu

MAGNet: A modular accelerator generator for neural networks

R Venkatesan, YS Shao, M Wang… - 2019 IEEE/ACM …, 2019 - ieeexplore.ieee.org

Deep neural networks have been adopted in a wide range of application domains, leading
to high demand for inference accelerators. However, the high cost associated with ASIC …

Cited by 135 Related articles All 4 versions

[PDF] nsf.gov

DSAGEN: Synthesizing programmable spatial accelerators

J Weng, S Liu, V Dadu, Z Wang, P Shah… - 2020 ACM/IEEE 47th …, 2020 - ieeexplore.ieee.org

Domain-specific hardware accelerators can provide orders of magnitude speedup and
energy efficiency over general purpose processors. However, they require extensive manual …

Cited by 124 Related articles All 10 versions

[PDF] ieee.org

STONNE: Enabling cycle-level microarchitectural simulation for DNN inference accelerators

F Muñoz-Martínez, JL Abellán… - 2021 IEEE …, 2021 - ieeexplore.ieee.org

The design of specialized architectures for accelerating the inference procedure of Deep
Neural Networks (DNNs) is a booming area of research nowadays. While first-generation …

Cited by 66 Related articles All 5 versions

[PDF] firoozshahian.com

MTIA: First generation silicon targeting Meta's recommendation systems

A Firoozshahian, J Coburn, R Levenstein… - Proceedings of the 50th …, 2023 - dl.acm.org

Meta has traditionally relied on using CPU-based servers for running inference workloads,
specifically Deep Learning Recommendation Models (DLRM), but the increasing compute …

Cited by 26 Related articles All 4 versions

[PDF] ctorng.com

Ultra-elastic CGRAs for irregular loop specialization

C Torng, P Pan, Y Ou, C Tan… - 2021 IEEE International …, 2021 - ieeexplore.ieee.org

Reconfigurable accelerator fabrics, including coarse-grain reconfigurable arrays (CGRAs),
have experienced a resurgence in interest because they allow fast-paced software algorithm …

Cited by 59 Related articles All 6 versions