Model compression and hardware acceleration for neural networks: A comprehensive survey

L Deng, G Li, S Han, L Shi, Y Xie - Proceedings of the IEEE, 2020 - ieeexplore.ieee.org
Domain-specific hardware is becoming a promising direction against the backdrop of slowing
improvement in general-purpose processors due to the foreseeable end of Moore's Law …

A survey on approximate edge AI for energy efficient autonomous driving services

D Katare, D Perino, J Nurmi, M Warnier… - … Surveys & Tutorials, 2023 - ieeexplore.ieee.org
Autonomous driving services depend on active sensing from modules such as cameras,
LiDAR, radar, and communication units. Traditionally, these modules process the sensed …

Merf: Memory-efficient radiance fields for real-time view synthesis in unbounded scenes

C Reiser, R Szeliski, D Verbin, P Srinivasan… - ACM Transactions on …, 2023 - dl.acm.org
Neural radiance fields enable state-of-the-art photorealistic view synthesis. However,
existing radiance field representations are either too compute-intensive for real-time …

A survey of quantization methods for efficient neural network inference

A Gholami, S Kim, Z Dong, Z Yao… - Low-Power Computer …, 2022 - taylorfrancis.com
This chapter surveys approaches to the problem of quantizing the numerical values in deep
neural network computations, covering the advantages and disadvantages of current methods …
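
As a minimal illustration of the kind of technique surveyed here, the sketch below applies uniform affine quantization to a weight tensor with NumPy; the 8-bit setting, rounding, and clipping choices are generic assumptions rather than any particular method from the chapter.

```python
import numpy as np

def quantize_uniform(x, num_bits=8):
    """Uniform affine quantization of a float tensor to signed integers (illustrative)."""
    qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    scale = (x.max() - x.min()) / (qmax - qmin)          # step size between levels
    zero_point = np.round(qmin - x.min() / scale)        # integer offset representing zero
    q = np.clip(np.round(x / scale + zero_point), qmin, qmax).astype(np.int8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Map the integers back to approximate float values."""
    return scale * (q.astype(np.float32) - zero_point)

w = np.random.randn(4, 4).astype(np.float32)
q, s, z = quantize_uniform(w)
print(np.abs(w - dequantize(q, s, z)).max())  # worst-case quantization error
```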

Sparsity in deep learning: Pruning and growth for efficient inference and training in neural networks

T Hoefler, D Alistarh, T Ben-Nun, N Dryden… - Journal of Machine …, 2021 - jmlr.org
The growing energy and performance costs of deep learning have driven the community to
reduce the size of neural networks by selectively pruning components. Similarly to their …
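
A one-shot magnitude-pruning sketch gives the flavor of the "selectively pruning components" that the survey covers; the 90% sparsity target and unstructured masking are assumptions for illustration only.

```python
import numpy as np

def magnitude_prune(weights, sparsity=0.9):
    """Zero out the smallest-magnitude entries until the target sparsity is reached."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy(), np.ones_like(weights, dtype=bool)
    threshold = np.partition(flat, k - 1)[k - 1]          # k-th smallest magnitude
    mask = np.abs(weights) > threshold                    # keep only larger weights
    return weights * mask, mask

w = np.random.randn(256, 256).astype(np.float32)
pruned, mask = magnitude_prune(w, sparsity=0.9)
print(1.0 - mask.mean())  # achieved sparsity, roughly 0.9
```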

Chasing sparsity in vision transformers: An end-to-end exploration

T Chen, Y Cheng, Z Gan, L Yuan… - Advances in Neural …, 2021 - proceedings.neurips.cc
Vision transformers (ViTs) have recently gained explosive popularity, but their enormous
model sizes and training costs remain daunting. Conventional post-training pruning often …

Autorep: Automatic relu replacement for fast private network inference

H Peng, S Huang, T Zhou, Y Luo… - Proceedings of the …, 2023 - openaccess.thecvf.com
The growth of the Machine-Learning-As-A-Service (MLaaS) market has highlighted clients'
data privacy and security issues. Private inference (PI) techniques using cryptographic …
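
Private inference work of this kind typically targets the cost of non-linear comparisons; as a hedged sketch of the general idea of ReLU replacement (not AutoReP's learned policy), the snippet below swaps every nn.ReLU in a PyTorch model for a fixed quadratic activation whose coefficients are arbitrary placeholders.

```python
import torch.nn as nn

class PolyAct(nn.Module):
    """Quadratic stand-in for ReLU; polynomials are cheap under cryptographic inference."""
    def __init__(self, a=0.25, b=0.5, c=0.0):
        super().__init__()
        self.a, self.b, self.c = a, b, c     # placeholder coefficients, not learned

    def forward(self, x):
        return self.a * x * x + self.b * x + self.c

def replace_relu(module):
    """Recursively swap every nn.ReLU for the polynomial activation."""
    for name, child in module.named_children():
        if isinstance(child, nn.ReLU):
            setattr(module, name, PolyAct())
        else:
            replace_relu(child)
    return module

net = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 10))
print(replace_relu(net))
```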

PAC-Bayes compression bounds so tight that they can explain generalization

S Lotfi, M Finzi, S Kapoor… - Advances in …, 2022 - proceedings.neurips.cc
While there has been progress in developing non-vacuous generalization bounds for deep
neural networks, these bounds tend to be uninformative about why deep learning works. In …

Enhance the visual representation via discrete adversarial training

X Mao, Y Chen, R Duan, Y Zhu, G Qi… - Advances in …, 2022 - proceedings.neurips.cc
Adversarial Training (AT), which is commonly accepted as one of the most effective
defenses against adversarial examples, can largely harm the standard …
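
For context, standard adversarial training learns from perturbed inputs; below is a minimal FGSM-based sketch with a toy linear model (the model, the epsilon value, and the function names are illustrative assumptions, and this is plain AT rather than the paper's discrete variant).

```python
import torch
import torch.nn as nn

def fgsm_example(model, x, y, epsilon=8 / 255):
    """Craft an FGSM adversarial example by stepping along the sign of the input gradient."""
    x = x.clone().detach().requires_grad_(True)
    loss = nn.functional.cross_entropy(model(x), y)
    loss.backward()
    return (x + epsilon * x.grad.sign()).clamp(0, 1).detach()

# Toy classifier and batch, stand-ins for a real vision model and image data.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
x, y = torch.rand(4, 3, 32, 32), torch.randint(0, 10, (4,))
x_adv = fgsm_example(model, x, y)
# Adversarial training minimizes the loss on (x_adv, y) in addition to (x, y).
print((x_adv - x).abs().max())  # perturbation stays within epsilon
```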

Network quantization with element-wise gradient scaling

J Lee, D Kim, B Ham - … of the IEEE/CVF conference on …, 2021 - openaccess.thecvf.com
Network quantization aims at reducing the bit-widths of weights and/or activations, which is
particularly important for implementing deep neural networks with limited hardware resources. Most …
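
A common way to realize this is quantization-aware training with a straight-through estimator; the sketch below scales that surrogate gradient element-wise, as a hypothetical illustration in the spirit of the title rather than the paper's exact EWGS rule.

```python
import torch

class ScaledRoundSTE(torch.autograd.Function):
    """Round in the forward pass; pass the gradient through with per-element scaling."""

    @staticmethod
    def forward(ctx, x, scale):
        ctx.save_for_backward(scale)
        return torch.round(x)

    @staticmethod
    def backward(ctx, grad_output):
        (scale,) = ctx.saved_tensors
        return grad_output * scale, None     # element-wise scaled straight-through gradient

x = torch.randn(8, requires_grad=True)
scale = 1.0 + 0.1 * torch.randn(8).abs()     # illustrative per-element scale factors
y = ScaledRoundSTE.apply(x, scale)
y.sum().backward()
print(x.grad)                                # equals the per-element scale here
```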