There's plenty of room at the Top: What will drive computer performance after Moore's law?

CE Leiserson, NC Thompson, JS Emer, BC Kuszmaul… - Science, 2020 - science.org
BACKGROUND Improvements in computing power can claim a large share of the credit for
many of the things that we take for granted in our modern lives: cellphones that are more …

Timeloop: A systematic approach to DNN accelerator evaluation

A Parashar, P Raina, YS Shao, YH Chen… - … analysis of systems …, 2019 - ieeexplore.ieee.org
This paper presents Timeloop, an infrastructure for evaluating and exploring the architecture
design space of deep neural network (DNN) accelerators. Timeloop uses a concise and …

[BOOK][B] Efficient processing of deep neural networks

V Sze, YH Chen, TJ Yang, JS Emer - 2020 - Springer
This book provides a structured treatment of the key principles and techniques for enabling
efficient processing of deep neural networks (DNNs). DNNs are currently widely used for …

ExTensor: An accelerator for sparse tensor algebra

K Hegde, H Asghari-Moghaddam, M Pellauer… - Proceedings of the …, 2019 - dl.acm.org
Generalized tensor algebra is a prime candidate for acceleration via customized ASICs.
Modern tensors feature a wide range of data sparsity, with the density of non-zero elements …

A systematic methodology for characterizing scalability of DNN accelerators using SCALE-Sim

A Samajdar, JM Joseph, Y Zhu… - … Analysis of Systems …, 2020 - ieeexplore.ieee.org
The compute demand for deep learning workloads is well known and is a prime motivator for
powerful parallel computing platforms such as GPUs or dedicated hardware accelerators …

MAGNet: A modular accelerator generator for neural networks

R Venkatesan, YS Shao, M Wang… - 2019 IEEE/ACM …, 2019 - ieeexplore.ieee.org
Deep neural networks have been adopted in a wide range of application domains, leading
to high demand for inference accelerators. However, the high cost associated with ASIC …

DSAGEN: Synthesizing programmable spatial accelerators

J Weng, S Liu, V Dadu, Z Wang, P Shah… - 2020 ACM/IEEE 47th …, 2020 - ieeexplore.ieee.org
Domain-specific hardware accelerators can provide orders of magnitude speedup and
energy efficiency over general purpose processors. However, they require extensive manual …

STONNE: Enabling cycle-level microarchitectural simulation for DNN inference accelerators

F Muñoz-Martínez, JL Abellán… - 2021 IEEE …, 2021 - ieeexplore.ieee.org
The design of specialized architectures for accelerating the inference procedure of Deep
Neural Networks (DNNs) is a booming area of research nowadays. While first-generation …

MTIA: First generation silicon targeting Meta's recommendation systems

A Firoozshahian, J Coburn, R Levenstein… - Proceedings of the 50th …, 2023 - dl.acm.org
Meta has traditionally relied on using CPU-based servers for running inference workloads,
specifically Deep Learning Recommendation Models (DLRM), but the increasing compute …

Ultra-elastic CGRAs for irregular loop specialization

C Torng, P Pan, Y Ou, C Tan… - 2021 IEEE International …, 2021 - ieeexplore.ieee.org
Reconfigurable accelerator fabrics, including coarse-grain reconfigurable arrays (CGRAs),
have experienced a resurgence in interest because they allow fast-paced software algorithm …