Pruning and quantization for deep neural network acceleration: A survey

T Liang, J Glossner, L Wang, S Shi, X Zhang - Neurocomputing, 2021 - Elsevier
Deep neural networks have been applied in many applications, exhibiting extraordinary
abilities in the field of computer vision. However, complex network architectures challenge …

ReCU: Reviving the dead weights in binary neural networks

Z Xu, M Lin, J Liu, J Chen, L Shao… - Proceedings of the …, 2021 - openaccess.thecvf.com
Binary neural networks (BNNs) have received increasing attention due to their substantial
reductions in computation and memory cost. Most existing works focus on either lessening the …

Mandheling: Mixed-precision on-device DNN training with DSP offloading

D Xu, M Xu, Q Wang, S Wang, Y Ma, K Huang… - Proceedings of the 28th …, 2022 - dl.acm.org
This paper proposes Mandheling, the first system that enables highly resource-efficient on-
device training by orchestrating mixed-precision training with on-chip Digital Signal …

Cambricon-Q: A hybrid architecture for efficient training

Y Zhao, C Liu, Z Du, Q Guo, X Hu… - 2021 ACM/IEEE 48th …, 2021 - ieeexplore.ieee.org
Deep neural network (DNN) training is notoriously time-consuming, and quantization is a
promising way to improve training efficiency through reduced bandwidth/storage requirements …

SiMaN: Sign-to-magnitude network binarization

M Lin, R Ji, Z Xu, B Zhang, F Chao… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Binary neural networks (BNNs) have attracted broad research interest due to their efficiency
in storage and computation. Nevertheless, a significant challenge of BNNs lies in …

SupeRBNN: Randomized binary neural network using adiabatic superconductor Josephson devices

Z Li, G Yuan, T Yamauchi, Z Masoud, Y Xie… - Proceedings of the 56th …, 2023 - dl.acm.org
Adiabatic Quantum-Flux-Parametron (AQFP) is a superconducting logic with extremely high
energy efficiency. By employing the distinct polarity of current to denote logic '0' and '1', AQFP …

FantastIC4: A hardware-software co-design approach for efficiently running 4-bit-compact multilayer perceptrons

S Wiedemann, S Shivapakash… - IEEE Open Journal …, 2021 - ieeexplore.ieee.org
With the growing demand for deploying Deep Learning models to the "edge", it is paramount
to develop techniques that allow executing state-of-the-art models within very tight and …

Design Exploration of In-Situ Error Correction for Multi-Bit Computation-in-Memory Circuits

TA Lin, PT Huang - 2024 IEEE Asia Pacific Conference on …, 2024 - ieeexplore.ieee.org
As computational complexity continues to rise, the effective design of computation-in-
memory (CIM) circuits using high-precision embedded non-volatile memory for …

Energy Efficient Hardware Architectures for Memory Prohibitive Deep Neural Networks

S Shivapakash - 2024 - search.proquest.com
Deep Neural Networks (DNN) form the backbone of modern Artificial Intelligence
(AI) systems. However, due to the high computational complexity and divergent shapes and …
