Model compression and hardware acceleration for neural networks: A comprehensive survey
Domain-specific hardware is becoming a promising topic against the backdrop of slowing
improvement in general-purpose processors due to the foreseeable end of Moore's Law …
A survey on approximate edge AI for energy efficient autonomous driving services
Autonomous driving services depend on active sensing from modules such as camera,
LiDAR, radar, and communication units. Traditionally, these modules process the sensed …
MERF: Memory-efficient radiance fields for real-time view synthesis in unbounded scenes
Neural radiance fields enable state-of-the-art photorealistic view synthesis. However,
existing radiance field representations are either too compute-intensive for real-time …
A survey of quantization methods for efficient neural network inference
This chapter provides approaches to the problem of quantizing the numerical values in deep
Neural Network computations, covering the advantages/disadvantages of current methods …
Sparsity in deep learning: Pruning and growth for efficient inference and training in neural networks
The growing energy and performance costs of deep learning have driven the community to
reduce the size of neural networks by selectively pruning components. Similarly to their …
Chasing sparsity in vision transformers: An end-to-end exploration
Vision transformers (ViTs) have recently received explosive popularity, but their enormous
model sizes and training costs remain daunting. Conventional post-training pruning often …
AutoReP: Automatic ReLU replacement for fast private network inference
The growth of the Machine-Learning-As-A-Service (MLaaS) market has highlighted clients'
data privacy and security issues. Private inference (PI) techniques using cryptographic …
PAC-Bayes compression bounds so tight that they can explain generalization
While there has been progress in developing non-vacuous generalization bounds for deep
neural networks, these bounds tend to be uninformative about why deep learning works. In …
Enhance the visual representation via discrete adversarial training
Adversarial Training (AT), which is commonly accepted as one of the most effective
approaches defending against adversarial examples, can largely harm the standard …
Network quantization with element-wise gradient scaling
Network quantization aims at reducing bit-widths of weights and/or activations, particularly
important for implementing deep neural networks with limited hardware resources. Most …