In-memory computing: Advances and prospects

N Verma, H Jia, H Valavi, Y Tang… - IEEE Solid-State …, 2019 - ieeexplore.ieee.org
In-memory computing (IMC) has the potential to address a critical and foundational challenge affecting computing
platforms today: the high energy and delay costs of moving and accessing data …

Learned step size quantization

SK Esser, JL McKinstry, D Bablani… - arXiv preprint arXiv …, 2019 - arxiv.org
Deep networks that run with low-precision operations at inference time offer power and space
advantages over high-precision alternatives, but need to overcome the challenge of …
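As a rough illustration of the scheme this entry describes, the sketch below shows the forward pass a learned-step-size quantizer might use on a symmetric signed integer grid. The function name and test values are illustrative; in the paper the step size `s` is a trainable parameter updated through a straight-through-style gradient, which is not modeled here.

```python
import numpy as np

def lsq_forward(x, s, bits=4):
    """Forward pass of a learned-step-size quantizer (LSQ-style sketch).

    x    : real-valued tensor (weights or activations)
    s    : step size; trainable in the actual method
    bits : bit width of the signed integer grid
    """
    q_n = -(2 ** (bits - 1))      # most negative level, e.g. -8 for 4 bits
    q_p = 2 ** (bits - 1) - 1     # most positive level, e.g. +7 for 4 bits
    v = np.clip(x / s, q_n, q_p)  # scale, then clip to the grid range
    return np.round(v) * s        # round to integers, rescale back

x = np.array([-1.3, -0.04, 0.26, 2.9])
print(lsq_forward(x, s=0.25, bits=4))  # → [-1.25  0.    0.25  1.75]
```

Values beyond the grid (2.9 here) saturate at the largest representable level, 7 × 0.25 = 1.75.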

Differentiable soft quantization: Bridging full-precision and low-bit neural networks

R Gong, X Liu, S Jiang, T Li, P Hu… - Proceedings of the …, 2019 - openaccess.thecvf.com
Hardware-friendly network quantization (e.g., binary/uniform quantization) can efficiently
accelerate inference while reducing the memory consumption of deep neural …
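A rough illustration of the soft-quantization idea: replace the hard rounding staircase with a per-bin scaled tanh whose sharpness `k` controls how closely it approximates hard quantization, keeping the mapping differentiable. The normalisation and constants below are a sketch of the general idea, not the paper's exact formulation.

```python
import numpy as np

def soft_quantize(x, step=0.25, k=10.0):
    """Soft, differentiable stand-in for hard rounding (DSQ-flavoured sketch).

    Within each quantization bin of width `step`, a scaled tanh replaces the
    staircase; as k grows the curve sharpens toward hard rounding.
    """
    n = np.floor(x / step)             # index of the bin containing x
    m = (n + 0.5) * step               # bin midpoint
    s = 1.0 / np.tanh(0.5 * k * step)  # normalise tanh to span the full bin
    return (n + 0.5 + 0.5 * s * np.tanh(k * (x - m))) * step
```

At a bin midpoint the output equals the midpoint exactly, and for large `k` the curve approaches rounding to the nearest multiple of `step`, e.g. `soft_quantize(0.3, step=0.25, k=100.0) ≈ 0.25`.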

Additive powers-of-two quantization: An efficient non-uniform discretization for neural networks

Y Li, X Dong, W Wang - arXiv preprint arXiv:1909.13144, 2019 - arxiv.org
We propose Additive Powers-of-Two (APoT) quantization, an efficient non-uniform
quantization scheme for the bell-shaped and long-tailed distribution of weights and …
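The additive-powers-of-two level set can be illustrated as follows: each nonnegative level is the sum of one term from each of several disjoint power-of-two sets, mirrored to negative values, and quantization projects onto the nearest level. The specific exponent sets below are made up for the example, not taken from the paper.

```python
import numpy as np
from itertools import product

def apot_levels():
    """Simplified additive-powers-of-two (APoT) level set with two terms.

    Term 1 is drawn from {0, 2^-1, 2^-2, 2^-3}, term 2 from
    {0, 2^-4, 2^-5, 2^-6}; a level is one choice from each, summed,
    then mirrored to negative values. Exponent sets are illustrative.
    """
    t1 = [0.0, 2**-1, 2**-2, 2**-3]
    t2 = [0.0, 2**-4, 2**-5, 2**-6]
    pos = sorted({a + b for a, b in product(t1, t2)})
    return np.array(sorted({s * l for s in (1.0, -1.0) for l in pos}))

def apot_quantize(x, levels):
    """Project each value onto the nearest APoT level."""
    idx = np.abs(x[:, None] - levels[None, :]).argmin(axis=1)
    return levels[idx]

lv = apot_levels()
print(apot_quantize(np.array([0.3, -0.7]), lv))
```

Because the exponents in each term set are small near zero and the sums stay dense around the origin, the resulting grid is finer where a bell-shaped weight distribution has most of its mass.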

Quantization and deployment of deep neural networks on microcontrollers

PE Novac, G Boukli Hacene, A Pegatoquet… - Sensors, 2021 - mdpi.com
Embedding Artificial Intelligence onto low-power devices is a challenging task that has been
partly overcome with recent advances in machine learning and hardware design. Presently …

Towards efficient model compression via learned global ranking

TW Chin, R Ding, C Zhang… - Proceedings of the …, 2020 - openaccess.thecvf.com
Pruning convolutional filters has demonstrated its effectiveness in compressing ConvNets.
Prior art in filter pruning requires users to specify a target model complexity (e.g., model size …
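A toy sketch of the global-ranking mechanics such filter pruning relies on, using an L1-norm magnitude score as a simple stand-in for the learned score the paper proposes; the function name, dict layout, and shapes are assumptions for illustration.

```python
import numpy as np

def rank_filters(layers):
    """Globally rank conv filters across layers by L1 norm (proxy score).

    `layers` maps layer name -> weight array of shape (out_ch, in_ch, kh, kw).
    A learned global ranking would replace the L1 score with a learned one;
    the cross-layer ranking and pruning mechanics stay the same.
    """
    scores = [(name, i, np.abs(w[i]).sum())
              for name, w in layers.items()
              for i in range(w.shape[0])]
    return sorted(scores, key=lambda s: s[2])  # smallest score = prune first

layers = {
    "conv1": np.arange(1, 5, dtype=float).reshape(2, 1, 1, 2),  # scores 3, 7
    "conv2": np.array([[[[5.0, 6.0]]]]),                        # score 11
}
print(rank_filters(layers))
```

Pruning then walks this single cross-layer list until the target complexity is met, rather than taking a fixed fraction per layer.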

A review of AI edge devices and lightweight CNN deployment

K Sun, X Wang, X Miao, Q Zhao - Neurocomputing, 2024 - Elsevier
Artificial Intelligence of Things (AIoT), which integrates artificial intelligence (AI) and
the Internet of Things (IoT), has attracted increasing attention recently. With the remarkable …

Mixed-precision deep learning based on computational memory

SR Nandakumar, M Le Gallo, C Piveteau… - Frontiers in …, 2020 - frontiersin.org
Deep neural networks (DNNs) have revolutionized the field of artificial intelligence and have
achieved unprecedented success in cognitive tasks such as image and speech recognition …

Deepshift: Towards multiplication-less neural networks

M Elhoushi, Z Chen, F Shafiq… - Proceedings of the …, 2021 - openaccess.thecvf.com
The high computation, memory, and power costs of inference with convolutional neural
networks (CNNs) are major bottlenecks of model deployment to edge computing platforms …
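The core idea behind this entry, restricting weights to signed powers of two so that each multiply becomes a bit shift plus a sign flip in integer hardware, can be sketched as below. Rounding to the nearest integer exponent is an illustrative choice, not necessarily the paper's training procedure.

```python
import numpy as np

def shift_quantize(w):
    """Round weights to signed powers of two (DeepShift-style sketch).

    Each weight becomes sign(w) * 2^p with integer p, so multiplying an
    integer activation by w reduces to a shift by |p| and a sign flip.
    """
    sign = np.sign(w)                          # zero weights stay zero
    p = np.round(np.log2(np.abs(w) + 1e-12))   # nearest integer exponent
    return sign * (2.0 ** p)

w = np.array([0.3, -0.9, 0.06])
print(shift_quantize(w))  # → [ 0.25   -1.      0.0625]
```

In a real deployment the exponents `p` (and signs) would be stored instead of the weights, and the shift amount read directly from them.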

Sparse weight activation training

MA Raihan, T Aamodt - Advances in Neural Information …, 2020 - proceedings.neurips.cc
Neural network training is computationally and memory intensive. Sparse training can
reduce the burden on emerging hardware platforms designed to accelerate sparse …
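A minimal sketch of the magnitude-based top-k sparsification this style of training applies, to weights and activations alike, before the matrix multiply; the helper name and `density` parameter are illustrative, not the paper's API.

```python
import numpy as np

def topk_mask(t, density):
    """Keep only the largest-magnitude fraction `density` of entries.

    SWAT-style sparsification: small-magnitude weights/activations are
    zeroed so the subsequent matrix multiply can skip them on sparse
    hardware.
    """
    k = max(1, int(round(density * t.size)))
    thresh = np.sort(np.abs(t), axis=None)[-k]   # k-th largest magnitude
    return np.where(np.abs(t) >= thresh, t, 0.0)

w = np.array([0.1, -2.0, 0.3, 0.05, 1.5, -0.2])
print(topk_mask(w, density=0.5))  # → [ 0.  -2.   0.3  0.   1.5 -0. ]
```

With `density=0.5` only the three largest-magnitude entries survive; the gradient would flow through the surviving entries only, which is what shifts the training cost onto sparse hardware.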