Pruning and quantization for deep neural network acceleration: A survey

T Liang, J Glossner, L Wang, S Shi, X Zhang - Neurocomputing, 2021 - Elsevier
Deep neural networks have been applied in many applications exhibiting extraordinary
abilities in the field of computer vision. However, complex network architectures challenge …

ReCU: Reviving the dead weights in binary neural networks

Z Xu, M Lin, J Liu, J Chen, L Shao… - Proceedings of the …, 2021 - openaccess.thecvf.com
Binary neural networks (BNNs) have received increasing attention due to their superior
reductions of computation and memory. Most existing works focus on either lessening the …

Mandheling: Mixed-precision on-device DNN training with DSP offloading

D Xu, M Xu, Q Wang, S Wang, Y Ma, K Huang… - Proceedings of the 28th …, 2022 - dl.acm.org
This paper proposes Mandheling, the first system that enables highly resource-efficient on-
device training by orchestrating mixed-precision training with on-chip Digital Signal …

Cambricon-Q: A hybrid architecture for efficient training

Y Zhao, C Liu, Z Du, Q Guo, X Hu… - 2021 ACM/IEEE 48th …, 2021 - ieeexplore.ieee.org
Deep neural network (DNN) training is notoriously time-consuming, and quantization is
promising to improve the training efficiency with reduced bandwidth/storage requirements …

SiMaN: Sign-to-magnitude network binarization

M Lin, R Ji, Z Xu, B Zhang, F Chao… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Binary neural networks (BNNs) have attracted broad research interest due to their efficient
storage and computational ability. Nevertheless, a significant challenge of BNNs lies in …

Acceleration unit for a deep learning engine

SP Singh, T Boesch, G Desoli - US Patent 11,687,762, 2023 - Google Patents
Embodiments of a device include an integrated circuit, a reconfigurable stream switch
formed in the integrated circuit along with a plurality of convolution accelerators and an …

Arithmetic unit for deep learning acceleration

SP Singh, G Desoli, T Boesch - US Patent 11,586,907, 2023 - Google Patents
Embodiments of a device include an integrated circuit, a reconfigurable
stream switch formed in the integrated circuit, and an arithmetic unit coupled to the …

SuperBNN: Randomized binary neural network using adiabatic superconductor Josephson devices

Z Li, G Yuan, T Yamauchi, Z Masoud, Y Xie… - Proceedings of the 56th …, 2023 - dl.acm.org
Adiabatic Quantum-Flux-Parametron (AQFP) is a superconducting logic with extremely high
energy efficiency. By employing the distinct polarity of current to denote logic '0' and '1', AQFP …

Accurate neural training with 4-bit matrix multiplications at standard formats

B Chmiel, R Banner, E Hoffer… - The Eleventh …, 2023 - openreview.net
Quantization of the weights and activations is one of the main methods to reduce the
computational footprint of Deep Neural Networks (DNNs) training. Current methods enable 4 …

FantastIC4: A hardware-software co-design approach for efficiently running 4bit-compact multilayer perceptrons

S Wiedemann, S Shivapakash… - IEEE Open Journal …, 2021 - ieeexplore.ieee.org
With the growing demand for deploying Deep Learning models to the “edge”, it is paramount
to develop techniques that allow to execute state-of-the-art models within very tight and …