Efficient acceleration of deep learning inference on resource-constrained edge devices: A review
Successful integration of deep neural networks (DNNs) or deep learning (DL) has resulted
in breakthroughs in many areas. However, deploying these highly accurate models for data …
A comprehensive review of binary neural network
Deep learning (DL) has recently changed the development of intelligent systems and is
widely adopted in many real-life applications. Despite their various benefits and potentials …
A survey of quantization methods for efficient neural network inference
This chapter provides approaches to the problem of quantizing the numerical values in deep
Neural Network computations, covering the advantages/disadvantages of current methods …
Pruning and quantization for deep neural network acceleration: A survey
T Liang, J Glossner, L Wang, S Shi, X Zhang - Neurocomputing, 2021 - Elsevier
Deep neural networks have been applied in many applications exhibiting extraordinary
abilities in the field of computer vision. However, complex network architectures challenge …
GhostNet: More features from cheap operations
Deploying convolutional neural networks (CNNs) on embedded devices is difficult due to the
limited memory and computation resources. The redundancy in feature maps is an important …
Binary neural networks: A survey
The binary neural network, largely saving the storage and computation, serves as a
promising technique for deploying deep models on resource-limited devices. However, the …
Single path one-shot neural architecture search with uniform sampling
We revisit the one-shot Neural Architecture Search (NAS) paradigm and analyze its
advantages over existing NAS approaches. Existing one-shot method, however, is hard to …
ReActNet: Towards precise binary neural network with generalized activation functions
In this paper, we propose several ideas for enhancing a binary network to close its accuracy
gap from real-valued networks without incurring any additional computational cost. We first …
BinaryBERT: Pushing the limit of BERT quantization
The rapid development of large pre-trained language models has greatly increased the
demand for model compression techniques, among which quantization is a popular solution …
MetaPruning: Meta learning for automatic neural network channel pruning
In this paper, we propose a novel meta learning approach for automatic channel pruning of
very deep neural networks. We first train a PruningNet, a kind of meta network, which is able …