An Efficient CNN Inference Accelerator Based on Intra- and Inter-Channel Feature Map Compression
C Xie, Z Shao, N Zhao, Y Du… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Deep convolutional neural networks (CNNs) generate large volumes of inter-layer data during
inference, which demands substantial on-chip memory capacity and off-chip bandwidth. To solve …
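The snippet does not describe the paper's actual compression scheme, so the following is only a hypothetical Python sketch of the general idea the title names: exploiting intra-channel (spatial) and inter-channel redundancy in a feature map before it leaves the chip. The function names and the zero-ratio proxy are illustrative assumptions, not the authors' method.

```python
# Toy illustration of intra-/inter-channel feature map redundancy removal.
# Not the paper's scheme; all names here are hypothetical.
import numpy as np

def intra_channel_delta(channel):
    """Delta-encode each row of one channel (exploits spatial smoothness)."""
    delta = channel.copy()
    delta[:, 1:] = channel[:, 1:] - channel[:, :-1]
    return delta

def inter_channel_delta(fmap):
    """Delta-encode each channel against the previous one (exploits
    correlation between adjacent channels)."""
    delta = fmap.copy()
    delta[1:] = fmap[1:] - fmap[:-1]
    return delta

def zero_ratio(x):
    """Fraction of zero elements -- a crude proxy for how well a simple
    zero-skipping or run-length back end could compress the data."""
    return float(np.count_nonzero(x == 0)) / x.size

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy post-ReLU feature map: 16 channels of 32x32, smooth within rows.
    fmap = np.maximum(rng.normal(0, 1, (16, 32, 32)).cumsum(axis=2), 0)
    fmap = np.round(fmap).astype(np.int32)

    encoded = inter_channel_delta(np.stack([intra_channel_delta(c) for c in fmap]))
    print("raw zero ratio     :", zero_ratio(fmap))
    print("encoded zero ratio :", zero_ratio(encoded))
```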
Minimizing Global Buffer Access in a Deep Learning Accelerator Using a Local Register File with a Rearranged Computational Sequence
We propose a method for minimizing global buffer accesses in a deep learning accelerator
for convolution operations by maximizing data reuse through a local register file, thereby …
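As a hedged illustration (not the paper's rearranged computational sequence, which the snippet does not detail), the sketch below contrasts global-buffer read counts for a 1-D convolution with and without a small local register file that holds the current input window; the read counters are illustrative bookkeeping, not a real accelerator API.

```python
# Sketch of register-file data reuse in convolution, under assumed naming.

def conv1d_no_reuse(x, w):
    """Every operand is fetched from the 'global buffer' (the x list)."""
    reads, out = 0, []
    for i in range(len(x) - len(w) + 1):
        acc = 0
        for k in range(len(w)):
            acc += x[i + k] * w[k]   # one global-buffer read per MAC
            reads += 1
        out.append(acc)
    return out, reads

def conv1d_register_reuse(x, w):
    """A K-entry local register file keeps the current input window; only one
    new element is fetched from the global buffer per output after warm-up."""
    K = len(w)
    regs = list(x[:K])               # warm-up: fill the register file
    reads = K
    out = [sum(r * c for r, c in zip(regs, w))]
    for i in range(K, len(x)):
        regs.pop(0)                  # shift window inside the register file
        regs.append(x[i])            # single new global-buffer read
        reads += 1
        out.append(sum(r * c for r, c in zip(regs, w)))
    return out, reads

if __name__ == "__main__":
    x, w = list(range(64)), [1, 2, 3]
    o1, r1 = conv1d_no_reuse(x, w)
    o2, r2 = conv1d_register_reuse(x, w)
    assert o1 == o2                  # same results, fewer global reads
    print("global-buffer reads without reuse:", r1)   # 62 outputs * 3 = 186
    print("global-buffer reads with reuse   :", r2)   # 64
```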
MOSDA: On-Chip Memory Optimized Sparse Deep Neural Network Accelerator with Efficient Index Matching
The irregular data access patterns caused by sparsity pose great challenges for efficient
processing in accelerators. Focusing on the index-matching property of DNNs, this article aims …
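A minimal sketch of the index-matching idea, assuming the common compressed (index, value) representation for both activations and weights: a multiply-accumulate is issued only when an activation index matches a weight index. This merge-join over sorted index lists is a generic technique, not necessarily MOSDA's specific hardware mechanism.

```python
# Index matching between sparse activations and sparse weights (illustrative).

def to_sparse(dense):
    """Compressed form: sorted list of (index, value) pairs for nonzeros."""
    return [(i, v) for i, v in enumerate(dense) if v != 0]

def sparse_dot(a_sparse, w_sparse):
    """Dot product by merging two sorted index streams; only matching
    indices trigger a useful multiply-accumulate."""
    acc, i, j = 0, 0, 0
    while i < len(a_sparse) and j < len(w_sparse):
        ai, av = a_sparse[i]
        wi, wv = w_sparse[j]
        if ai == wi:
            acc += av * wv           # indices match: useful MAC
            i, j = i + 1, j + 1
        elif ai < wi:
            i += 1                   # activation has no matching weight
        else:
            j += 1                   # weight has no matching activation
    return acc

if __name__ == "__main__":
    act = [0, 3, 0, 0, 5, 0, 2, 0]
    wgt = [1, 0, 0, 4, 5, 0, 6, 0]
    dense = sum(a * w for a, w in zip(act, wgt))
    assert sparse_dot(to_sparse(act), to_sparse(wgt)) == dense
    print("sparse dot product:", dense)
```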
Evaluation Metrics for the Cost of Data Movement in Deep Neural Network Acceleration
Hardware accelerators are designed to support a specialized processing dataflow for
ever-changing deep neural networks (DNNs) under various processing environments. This …
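Since the snippet does not state the paper's actual metrics, the sketch below only illustrates the usual form a data-movement cost metric takes for DNN accelerators: weighting the access count at each memory-hierarchy level by an assumed per-access energy. The energy values are placeholder assumptions, not measured numbers from the paper.

```python
# Hedged sketch of a data-movement cost model; all numbers are placeholders.

# Hypothetical per-access energy in arbitrary relative units.
ENERGY_PER_ACCESS = {
    "register_file": 1.0,
    "global_buffer": 6.0,
    "dram": 200.0,
}

def data_movement_energy(access_counts):
    """Total data-movement energy: sum over memory levels of
    (number of accesses) * (energy per access at that level)."""
    return sum(ENERGY_PER_ACCESS[level] * count
               for level, count in access_counts.items())

if __name__ == "__main__":
    # Toy access profile for one convolution layer under some dataflow.
    profile = {"register_file": 5_000_000, "global_buffer": 400_000, "dram": 50_000}
    print("estimated data-movement energy:", data_movement_energy(profile))
    # Comparing this total across candidate dataflows (e.g. weight- vs
    # output-stationary) is how such metrics are typically used.
```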