A survey on approximate edge AI for energy efficient autonomous driving services

D Katare, D Perino, J Nurmi, M Warnier… - … Surveys & Tutorials, 2023 - ieeexplore.ieee.org
Autonomous driving services depend on active sensing from modules such as camera,
LiDAR, radar, and communication units. Traditionally, these modules process the sensed …

Optimally scheduling CNN convolutions for efficient memory access

A Stoutchinin, F Conti, L Benini - arXiv preprint arXiv:1902.01492, 2019 - arxiv.org
Embedded inference engines for convolutional networks must be parsimonious in memory
bandwidth and buffer sizing to meet power and cost constraints. We present an analytical …

Research on NVIDIA deep learning accelerator

G Zhou, J Zhou, H Lin - 2018 12th IEEE International …, 2018 - ieeexplore.ieee.org
This paper introduces the NVIDIA deep learning accelerator (NVDLA), including its
hardware architecture specification and software environment. At the same time, the basic …

CUTIE: Beyond PetaOp/s/W ternary DNN inference acceleration with better-than-binary energy efficiency

M Scherer, G Rutishauser, L Cavigelli… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
We present a 3.1 POp/s/W fully digital hardware accelerator for ternary neural networks
(TNNs). CUTIE, the completely unrolled ternary inference engine, focuses on minimizing …

A 2.5 μW KWS Engine With Pruned LSTM and Embedded MFCC for IoT Applications

YS Chong, WL Goh, VP Nambiar… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
Always-on keyword spotting (KWS) hardware is gaining popularity in ultra-low power IoT
applications where specific words are used to wake up and activate the power-hungry …

Design of a sparsity-aware reconfigurable deep learning accelerator supporting various types of operations

SF Hsiao, KC Chen, CC Lin… - IEEE Journal on …, 2020 - ieeexplore.ieee.org
The superiority of various Deep Neural Network (DNN) models, such as Convolutional
Neural Networks (CNN), Generative Adversarial Networks (GAN), and Recurrent Neural …

Simplify: A Python library for optimizing pruned neural networks

A Bragagnolo, CA Barbano - SoftwareX, 2022 - Elsevier
Neural network pruning allows for impressive theoretical reduction of model sizes and
complexity. However, it usually offers little practical benefit as it is most often limited to just …

Amortized neural networks for low-latency speech recognition

J Macoskey, GP Strimel, J Su, A Rastrow - arXiv preprint arXiv:2108.01553, 2021 - arxiv.org
We introduce Amortized Neural Networks (AmNets), a compute cost- and latency-aware
network architecture particularly well-suited for sequence modeling tasks. We apply AmNets …

Evaluating robustness to noise and compression of deep neural networks for keyword spotting

PH Pereira, W Beccaro, MA Ramírez - IEEE Access, 2023 - ieeexplore.ieee.org
Keyword Spotting (KWS) has been the subject of research in recent years given the increase
of embedded systems for command recognition such as Alexa, Google Home, and Siri …

Activation sparsity and dynamic pruning for split computing in edge AI

J Haberer, O Landsiedel - … of the 3rd International Workshop on …, 2022 - dl.acm.org
Deep neural networks are getting larger and, therefore, harder to deploy on constrained IoT
devices. Split computing provides a solution by splitting a network and placing the first few …