Hardware acceleration of sparse and irregular tensor computations of ml models: A survey and insights
Machine learning (ML) models are widely used in many important domains. For efficiently
processing these computational- and memory-intensive applications, tensors of these …
Snap: An efficient sparse neural acceleration processor for unstructured sparse deep neural network inference
Recent developments in deep neural network (DNN) pruning introduce data sparsity to
enable deep learning applications to run more efficiently on resource- and energy …
Spada: Accelerating sparse matrix multiplication with adaptive dataflow
Sparse matrix-matrix multiplication (SpGEMM) is widely used in many scientific and deep
learning applications. The highly irregular structures of SpGEMM limit its performance and …
Hardware accelerator design for sparse dnn inference and training: A tutorial
Deep neural networks (DNNs) are widely used in many fields, such as artificial intelligence
generated content (AIGC) and robotics. To efficiently support these tasks, the model pruning …
Z-PIM: A sparsity-aware processing-in-memory architecture with fully variable weight bit-precision for energy-efficient deep neural networks
We present an energy-efficient processing-in-memory (PIM) architecture named Z-PIM that
supports both sparsity handling and fully variable bit-precision in weight data for energy …
CNN inference using a preprocessing precision controller and approximate multipliers with various precisions
I Hammad, L Li, K El-Sankary, WM Snelgrove - IEEE Access, 2021 - ieeexplore.ieee.org
This article proposes boosting the multiplication performance for convolutional neural
network (CNN) inference using a precision prediction preprocessor which controls various …
Memory-efficient CNN accelerator based on interlayer feature map compression
Z Shao, X Chen, L Du, L Chen, Y Du… - … on Circuits and …, 2021 - ieeexplore.ieee.org
Existing deep convolutional neural networks (CNNs) generate massive interlayer feature
data during network inference. To maintain real-time processing in embedded systems …
An efficient unstructured sparse convolutional neural network accelerator for wearable ECG classification device
J Lu, D Liu, X Cheng, L Wei, A Hu… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Convolutional neural network (CNN) with pruning techniques has shown remarkable
prospects in electrocardiogram (ECG) classification. However, efficiently deploying the …
CUTIE: Beyond PetaOp/s/W ternary DNN inference acceleration with better-than-binary energy efficiency
We present a 3.1 POp/s/W fully digital hardware accelerator for ternary neural networks
(TNNs). CUTIE, the completely unrolled ternary inference engine, focuses on minimizing …
Trainer: An energy-efficient edge-device training processor supporting dynamic weight pruning
Transfer learning, which transfers knowledge from source datasets to target datasets, is
practical for adaptive deep neural network (DNN) applications. When considering user …