Efficient acceleration of deep learning inference on resource-constrained edge devices: A review

MMH Shuvo, SK Islam, J Cheng… - Proceedings of the …, 2022 - ieeexplore.ieee.org
Successful integration of deep neural networks (DNNs) or deep learning (DL) has resulted
in breakthroughs in many areas. However, deploying these highly accurate models for data …

A comprehensive review of binary neural network

C Yuan, SS Agaian - Artificial Intelligence Review, 2023 - Springer
Deep learning (DL) has recently changed the development of intelligent systems and is
widely adopted in many real-life applications. Despite their various benefits and potentials …

A survey of quantization methods for efficient neural network inference

A Gholami, S Kim, Z Dong, Z Yao… - Low-Power Computer …, 2022 - taylorfrancis.com
This chapter provides approaches to the problem of quantizing the numerical values in deep
Neural Network computations, covering the advantages/disadvantages of current methods …

Pruning and quantization for deep neural network acceleration: A survey

T Liang, J Glossner, L Wang, S Shi, X Zhang - Neurocomputing, 2021 - Elsevier
Deep neural networks have been applied in many applications exhibiting extraordinary
abilities in the field of computer vision. However, complex network architectures challenge …

GhostNet: More features from cheap operations

K Han, Y Wang, Q Tian, J Guo… - Proceedings of the …, 2020 - openaccess.thecvf.com
Deploying convolutional neural networks (CNNs) on embedded devices is difficult due to the
limited memory and computation resources. The redundancy in feature maps is an important …

Binary neural networks: A survey

H Qin, R Gong, X Liu, X Bai, J Song, N Sebe - Pattern Recognition, 2020 - Elsevier
The binary neural network, largely saving the storage and computation, serves as a
promising technique for deploying deep models on resource-limited devices. However, the …

Single path one-shot neural architecture search with uniform sampling

Z Guo, X Zhang, H Mu, W Heng, Z Liu, Y Wei… - Computer Vision–ECCV …, 2020 - Springer
We revisit the one-shot Neural Architecture Search (NAS) paradigm and analyze its
advantages over existing NAS approaches. Existing one-shot method, however, is hard to …

ReActNet: Towards precise binary neural network with generalized activation functions

Z Liu, Z Shen, M Savvides, KT Cheng - … Glasgow, UK, August 23–28, 2020 …, 2020 - Springer
In this paper, we propose several ideas for enhancing a binary network to close its accuracy
gap from real-valued networks without incurring any additional computational cost. We first …

BinaryBERT: Pushing the limit of BERT quantization

H Bai, W Zhang, L Hou, L Shang, J Jin, X Jiang… - arXiv preprint arXiv …, 2020 - arxiv.org
The rapid development of large pre-trained language models has greatly increased the
demand for model compression techniques, among which quantization is a popular solution …

MetaPruning: Meta learning for automatic neural network channel pruning

Z Liu, H Mu, X Zhang, Z Guo, X Yang… - Proceedings of the …, 2019 - openaccess.thecvf.com
In this paper, we propose a novel meta learning approach for automatic channel pruning of
very deep neural networks. We first train a PruningNet, a kind of meta network, which is able …