Addressing sparsity in deep neural networks

D Katare, D Perino, J Nurmi, M Warnier… - … Surveys & Tutorials, 2023 - ieeexplore.ieee.org

Autonomous driving services depend on active sensing from modules such as camera,
LiDAR, radar, and communication units. Traditionally, these modules process the sensed …

Cited by 41 - Related articles - All 11 versions

Optimally scheduling CNN convolutions for efficient memory access

A Stoutchinin, F Conti, L Benini - arXiv preprint arXiv:1902.01492, 2019 - arxiv.org

Embedded inference engines for convolutional networks must be parsimonious in memory
bandwidth and buffer sizing to meet power and cost constraints. We present an analytical …

Cited by 58 - Related articles - All 4 versions

Research on NVIDIA deep learning accelerator

G Zhou, J Zhou, H Lin - 2018 12th IEEE International …, 2018 - ieeexplore.ieee.org

This paper introduces the NVIDIA deep learning accelerator (NVDLA), including its
hardware architecture specification and software environment. At the same time, the basic …

Cited by 54 - Related articles

CUTIE: Beyond PetaOp/s/W ternary DNN inference acceleration with better-than-binary energy efficiency

M Scherer, G Rutishauser, L Cavigelli… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org

We present a 3.1 POp/s/W fully digital hardware accelerator for ternary neural networks
(TNNs). CUTIE, the completely unrolled ternary inference engine, focuses on minimizing …

Cited by 28 - Related articles - All 9 versions

A 2.5 μW KWS Engine With Pruned LSTM and Embedded MFCC for IoT Applications

YS Chong, WL Goh, VP Nambiar… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org

Always-on keyword spotting (KWS) hardware is gaining popularity in ultra-low power IoT
applications where specific words are used to wake up and activate the power hungry …

Cited by 17 - Related articles - All 2 versions

Design of a sparsity-aware reconfigurable deep learning accelerator supporting various types of operations

SF Hsiao, KC Chen, CC Lin… - IEEE Journal on …, 2020 - ieeexplore.ieee.org

The superiority of various Deep Neural Networks (DNN) models, such as Convolutional
Neural Networks (CNN), Generative Adversarial Networks (GAN), and Recurrent Neural …

Cited by 23 - Related articles - All 3 versions

Simplify: A Python library for optimizing pruned neural networks

A Bragagnolo, CA Barbano - SoftwareX, 2022 - Elsevier

Neural network pruning allows for impressive theoretical reduction of model sizes and
complexity. However, it usually offers little practical benefit as it is most often limited to just …

Cited by 17 - Related articles - All 7 versions

Amortized neural networks for low-latency speech recognition

J Macoskey, GP Strimel, J Su, A Rastrow - arXiv preprint arXiv:2108.01553, 2021 - arxiv.org

We introduce Amortized Neural Networks (AmNets), a compute cost- and latency-aware
network architecture particularly well-suited for sequence modeling tasks. We apply AmNets …

Cited by 16 - Related articles - All 5 versions

Evaluating robustness to noise and compression of deep neural networks for keyword spotting

PH Pereira, W Beccaro, MA Ramírez - IEEE Access, 2023 - ieeexplore.ieee.org

Keyword Spotting (KWS) has been the subject of research in recent years given the increase
of embedded systems for command recognition such as Alexa, Google Home, and Siri …

Cited by 2 - Related articles - All 3 versions

Activation sparsity and dynamic pruning for split computing in edge AI

J Haberer, O Landsiedel - … of the 3rd International Workshop on …, 2022 - dl.acm.org

Deep neural networks are getting larger and, therefore, harder to deploy on constrained IoT
devices. Split computing provides a solution by splitting a network and placing the first few …

Cited by 3 - Related articles - All 3 versions