An Efficient CNN Inference Accelerator Based on Intra- and Inter-Channel Feature Map Compression

C Xie, Z Shao, N Zhao, Y Du… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Deep convolutional neural networks (CNNs) generate intensive inter-layer data during
inference, which requires substantial on-chip memory capacity and off-chip bandwidth. To solve …
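
The snippet describes compressing inter-layer feature maps to shrink on-chip storage and off-chip traffic. As a generic point of reference, the sketch below shows one common compression scheme for post-ReLU feature maps, zero run-length encoding; the function name and pair format are illustrative assumptions, not the paper's actual intra-/inter-channel algorithm.

```python
# A minimal sketch of feature-map compression via zero run-length
# encoding (a generic illustration, not the paper's scheme).
import numpy as np

def zero_rle_encode(channel: np.ndarray) -> list[tuple[int, float]]:
    """Encode a feature-map channel as (zero_run, value) pairs.

    Post-ReLU feature maps are typically sparse, so storing only the
    number of zeros preceding each nonzero value reduces the on-chip
    footprint and the off-chip transfer volume.
    """
    pairs = []
    run = 0
    for v in channel.ravel():
        if v == 0:
            run += 1          # extend the current run of zeros
        else:
            pairs.append((run, float(v)))
            run = 0
    if run:                    # trailing zeros, flagged with a 0.0 value
        pairs.append((run, 0.0))
    return pairs

fmap = np.maximum(np.random.randn(8, 8), 0)   # ReLU-like sparsity
encoded = zero_rle_encode(fmap)
print(f"dense: {fmap.size} values, encoded: {len(encoded)} pairs")
```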

Minimizing Global Buffer Access in a Deep Learning Accelerator Using a Local Register File with a Rearranged Computational Sequence

M Lee, Z Zhang, S Choi, J Choi - Sensors, 2022 - mdpi.com
We propose a method for minimizing global buffer access within a deep learning accelerator
for convolution operations by maximizing the data reuse through a local register file, thereby …
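
The idea in this snippet is to keep data in a local register file so each value fetched from the global buffer is reused across many multiply-accumulates. The sketch below illustrates the general principle with a scalar convolution loop: the partial sum lives in a local variable (standing in for a register) and the output array (standing in for the global buffer) is written once per output pixel rather than once per MAC. The loop order and counters are illustrative, not the paper's rearranged computational sequence.

```python
# A minimal sketch of register-level partial-sum reuse in convolution.
import numpy as np

def conv2d_register_reuse(x: np.ndarray, w: np.ndarray):
    H, W = x.shape
    K, _ = w.shape
    out = np.zeros((H - K + 1, W - K + 1))   # stands in for the global buffer
    buffer_writes = 0
    for oy in range(out.shape[0]):
        for ox in range(out.shape[1]):
            acc = 0.0                         # "local register" holding the psum
            for ky in range(K):
                for kx in range(K):
                    acc += x[oy + ky, ox + kx] * w[ky, kx]
            out[oy, ox] = acc                 # single buffer write per output
            buffer_writes += 1
    return out, buffer_writes

x, w = np.random.randn(6, 6), np.random.randn(3, 3)
y, n = conv2d_register_reuse(x, w)
print(f"{n} buffer writes with register reuse vs "
      f"{y.size * w.size} psum updates without it")
```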

MOSDA: On-Chip Memory Optimized Sparse Deep Neural Network Accelerator with Efficient Index Matching

H Xu, J Shiomi, H Onodera - IEEE Open Journal of Circuits and …, 2020 - ieeexplore.ieee.org
The irregular data access patterns caused by sparsity pose great challenges to efficient
processing in accelerators. Focusing on the index-matching property of DNNs, this article aims …
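
Index matching refers to identifying activation/weight pairs whose nonzero indices coincide, so that only effectual multiplications are performed. The sketch below shows the idea as a two-pointer merge over sorted sparse vectors; this is a generic software illustration, not MOSDA's hardware matching unit.

```python
# A minimal sketch of index matching for a sparse dot product:
# multiply only where the nonzero index sets of the activation and
# weight vectors intersect, skipping ineffectual pairs.
def index_match_dot(a_idx, a_val, w_idx, w_val):
    """Dot product of two sparse vectors given sorted (index, value) arrays."""
    i = j = 0
    acc = 0.0
    while i < len(a_idx) and j < len(w_idx):
        if a_idx[i] == w_idx[j]:        # matched index: effectual MAC
            acc += a_val[i] * w_val[j]
            i += 1
            j += 1
        elif a_idx[i] < w_idx[j]:       # unmatched: skip, no computation
            i += 1
        else:
            j += 1
    return acc

# activations nonzero at {1, 4, 7}, weights nonzero at {0, 4, 7}
print(index_match_dot([1, 4, 7], [2.0, 3.0, 1.0],
                      [0, 4, 7], [0.5, 2.0, 4.0]))   # 3*2 + 1*4 = 10.0
```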

Evaluation Metrics for the Cost of Data Movement in Deep Neural Network Acceleration

H Xu, J Shiomi, H Onodera - IEICE Transactions on Fundamentals …, 2021 - search.ieice.org
Hardware accelerators are designed to support a specialized processing dataflow for
ever-changing deep neural networks (DNNs) under various processing environments. This …
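
A common way to quantify the cost of data movement is to weight the access count at each memory level by a per-access energy figure, since accesses farther from the PEs cost orders of magnitude more energy. The sketch below shows this style of metric; the level names and pJ/access values are illustrative placeholders, not the metrics or figures proposed in the paper.

```python
# A minimal sketch of a data-movement energy metric: sum of
# (accesses x per-access energy) over the memory hierarchy.
ENERGY_PJ = {"register": 0.1, "local_buffer": 1.0,
             "global_buffer": 6.0, "dram": 200.0}   # assumed costs in pJ

def movement_cost(access_counts: dict[str, int]) -> float:
    """Total data-movement energy in pJ for a dataflow's access profile."""
    return sum(ENERGY_PJ[level] * n for level, n in access_counts.items())

# compare two hypothetical dataflows for the same layer
reuse_heavy = {"register": 1_000_000, "local_buffer": 50_000,
               "global_buffer": 5_000, "dram": 1_000}
naive       = {"register": 0, "local_buffer": 0,
               "global_buffer": 1_000_000, "dram": 50_000}
print(movement_cost(reuse_heavy))   # reuse keeps most traffic near the PEs
print(movement_cost(naive))         # naive dataflow pays DRAM-level energy
```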