A survey on approximate edge AI for energy efficient autonomous driving services
Autonomous driving services depend on active sensing from modules such as camera,
LiDAR, radar, and communication units. Traditionally, these modules process the sensed …
Optimally scheduling CNN convolutions for efficient memory access
Embedded inference engines for convolutional networks must be parsimonious in memory
bandwidth and buffer sizing to meet power and cost constraints. We present an analytical …
Research on NVIDIA deep learning accelerator
G Zhou, J Zhou, H Lin - 2018 12th IEEE International …, 2018 - ieeexplore.ieee.org
This paper introduces the NVIDIA deep learning accelerator (NVDLA), including its
hardware architecture specification and software environment. At the same time, the basic …
CUTIE: Beyond PetaOp/s/W ternary DNN inference acceleration with better-than-binary energy efficiency
We present a 3.1 POp/s/W fully digital hardware accelerator for ternary neural networks
(TNNs). CUTIE, the completely unrolled ternary inference engine, focuses on minimizing …
A 2.5 μW KWS Engine With Pruned LSTM and Embedded MFCC for IoT Applications
Always-on keyword spotting (KWS) hardware is gaining popularity in ultra-low power IoT
applications where specific words are used to wake up and activate the power hungry …
Design of a sparsity-aware reconfigurable deep learning accelerator supporting various types of operations
The superiority of various Deep Neural Networks (DNN) models, such as Convolutional
Neural Networks (CNN), Generative Adversarial Networks (GAN), and Recurrent Neural …
Simplify: A Python library for optimizing pruned neural networks
A Bragagnolo, CA Barbano - SoftwareX, 2022 - Elsevier
Neural network pruning allows for impressive theoretical reduction of model sizes and
complexity. However, it usually offers little practical benefit, as it is most often limited to just …
Amortized neural networks for low-latency speech recognition
We introduce Amortized Neural Networks (AmNets), a compute cost- and latency-aware
network architecture particularly well-suited for sequence modeling tasks. We apply AmNets …
Evaluating robustness to noise and compression of deep neural networks for keyword spotting
PH Pereira, W Beccaro, MA Ramírez - IEEE Access, 2023 - ieeexplore.ieee.org
Keyword Spotting (KWS) has been the subject of research in recent years given the increase
of embedded systems for command recognition such as Alexa, Google Home, and Siri …
Activation sparsity and dynamic pruning for split computing in edge AI
J Haberer, O Landsiedel - … of the 3rd International Workshop on …, 2022 - dl.acm.org
Deep neural networks are getting larger and, therefore, harder to deploy on constrained IoT
devices. Split computing provides a solution by splitting a network and placing the first few …