Model compression and hardware acceleration for neural networks: A comprehensive survey

L Deng, G Li, S Han, L Shi, Y Xie - Proceedings of the IEEE, 2020 - ieeexplore.ieee.org
Domain-specific hardware is becoming a promising topic against the backdrop of slowing
improvement in general-purpose processors due to the foreseeable end of Moore's Law …

Normalization techniques in training DNNs: Methodology, analysis and application

L Huang, J Qin, Y Zhou, F Zhu, L Liu… - IEEE transactions on …, 2023 - ieeexplore.ieee.org
Normalization techniques are essential for accelerating the training and improving the
generalization of deep neural networks (DNNs), and have successfully been used in various …

Analyzing and improving the training dynamics of diffusion models

T Karras, M Aittala, J Lehtinen… - Proceedings of the …, 2024 - openaccess.thecvf.com
Diffusion models currently dominate the field of data-driven image synthesis with their
unparalleled scaling to large datasets. In this paper, we identify and rectify several causes for …

3D human pose estimation in video with temporal convolutions and semi-supervised training

D Pavllo, C Feichtenhofer… - Proceedings of the …, 2019 - openaccess.thecvf.com
In this work, we demonstrate that 3D poses in video can be effectively estimated with a fully
convolutional model based on dilated temporal convolutions over 2D keypoints. We also …
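As a rough illustration of the building block this entry describes (not the authors' released model), the sketch below stacks dilated 1-D convolutions over a sequence of 2D keypoints to regress 3D joints; the joint count, channel widths, and dilation schedule are illustrative assumptions.

```python
# Minimal PyTorch sketch: dilated temporal convolutions over 2D keypoints.
# Joint count, channels, and dilations are assumptions, not the paper's config.
import torch
import torch.nn as nn

num_joints = 17
model = nn.Sequential(
    nn.Conv1d(num_joints * 2, 256, kernel_size=3, dilation=1), nn.ReLU(),
    nn.Conv1d(256, 256, kernel_size=3, dilation=3), nn.ReLU(),
    nn.Conv1d(256, num_joints * 3, kernel_size=1),
)

x = torch.randn(8, num_joints * 2, 27)  # (batch, joints*2, frames)
y = model(x)                            # receptive field trims the clip length
print(y.shape)                          # -> (8, 51, 19)
```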

Study the influence of normalization/transformation process on the accuracy of supervised classification

VNG Raju, KP Lakshmi, VM Jain… - … on Smart Systems …, 2020 - ieeexplore.ieee.org
Recent developments in analytical technologies have enabled applications for real-
time problems faced by industries. These applications are often found to consume more time …
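For context on the feature scalings such studies typically compare, here is a small illustrative sketch (not taken from the paper) of min-max normalization and z-score standardization applied before a supervised classifier.

```python
# Illustrative feature scalings; epsilon added to avoid division by zero.
import numpy as np

def min_max(x):
    """Rescale each feature to [0, 1]."""
    lo, hi = x.min(axis=0), x.max(axis=0)
    return (x - lo) / (hi - lo + 1e-12)

def z_score(x):
    """Center each feature and scale it to unit variance."""
    return (x - x.mean(axis=0)) / (x.std(axis=0) + 1e-12)

x = np.random.rand(100, 4) * np.array([1, 10, 100, 1000])  # mixed scales
print(min_max(x).max(axis=0), z_score(x).std(axis=0))
```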

Understanding the generalization benefit of normalization layers: Sharpness reduction

K Lyu, Z Li, S Arora - Advances in Neural Information …, 2022 - proceedings.neurips.cc
Normalization layers (e.g., Batch Normalization, Layer Normalization) were
introduced to help with optimization difficulties in very deep nets, but they clearly also help …
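As a quick reminder of the two layers named in this entry, here is a minimal NumPy sketch of their core computation (learnable scale and shift parameters omitted; shapes and epsilon are assumptions).

```python
# Batch norm normalizes over the batch axis; layer norm over the feature axis.
import numpy as np

def batch_norm(x, eps=1e-5):
    """Normalize each feature over the batch dimension (axis 0)."""
    mean = x.mean(axis=0, keepdims=True)
    var = x.var(axis=0, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def layer_norm(x, eps=1e-5):
    """Normalize each sample over its feature dimension (axis -1)."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

x = np.random.randn(32, 64)            # (batch, features)
print(batch_norm(x).std(axis=0)[:3])   # ~1 per feature
print(layer_norm(x).std(axis=-1)[:3])  # ~1 per sample
```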

Additive powers-of-two quantization: An efficient non-uniform discretization for neural networks

Y Li, X Dong, W Wang - arXiv preprint arXiv:1909.13144, 2019 - arxiv.org
We propose Additive Powers-of-Two (APoT) quantization, an efficient non-uniform
quantization scheme for the bell-shaped and long-tailed distribution of weights and …
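The core idea, sketched below in a hedged form, is that each quantization level is a sum of power-of-two terms, so levels cluster densely near zero where bell-shaped weight distributions concentrate. The term sets and bit split here are illustrative assumptions, not the paper's exact construction.

```python
# Hedged sketch of additive powers-of-two levels and nearest-level quantization.
import itertools
import numpy as np

def apot_levels(num_terms=2, bits_per_term=2):
    """Enumerate levels as additive combinations of power-of-two terms."""
    term_sets = []
    for t in range(num_terms):
        # each term draws from {0} plus a shifted set of powers of two
        term_sets.append([0.0] + [2.0 ** -(t + num_terms * i + 1)
                                  for i in range(2 ** bits_per_term - 1)])
    sums = {sum(combo) for combo in itertools.product(*term_sets)}
    levels = np.array(sorted(sums))
    return levels / levels.max()            # normalize levels to [0, 1]

def quantize(w, levels):
    """Snap each |w| to the nearest level, keeping the sign."""
    scale = np.abs(w).max()
    idx = np.abs(np.abs(w[..., None]) / scale - levels).argmin(axis=-1)
    return np.sign(w) * levels[idx] * scale

w = np.random.randn(6)
print(quantize(w, apot_levels()))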

Characterizing and overcoming the greedy nature of learning in multi-modal deep neural networks

N Wu, S Jastrzebski, K Cho… - … Conference on Machine …, 2022 - proceedings.mlr.press
We hypothesize that due to the greedy nature of learning in multi-modal deep neural
networks, these models tend to rely on just one modality while under-fitting the other …

Fixup initialization: Residual learning without normalization

H Zhang, YN Dauphin, T Ma - arXiv preprint arXiv:1901.09321, 2019 - arxiv.org
Normalization layers are a staple in state-of-the-art deep neural network architectures. They
are widely believed to stabilize training, enable higher learning rates, accelerate …
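The sketch below gives a hedged, simplified version of the Fixup-style rescaling the title refers to: initialize normally, scale the inner layers of each residual branch by L^(-1/(2m-2)), and zero the branch's last layer so every residual block starts as the identity. The Linear-only branch structure and the example stack are assumptions.

```python
# Simplified Fixup-style initialization for a stack of residual branches.
import torch.nn as nn

def fixup_init(branches):
    """branches: list of nn.Sequential residual branches built from Linear layers."""
    L = len(branches)                        # number of residual branches
    for branch in branches:
        linears = [layer for layer in branch if isinstance(layer, nn.Linear)]
        m = len(linears)                     # layers per branch
        for lin in linears[:-1]:
            nn.init.kaiming_normal_(lin.weight)
            lin.weight.data.mul_(L ** (-1.0 / (2 * m - 2)))
        nn.init.zeros_(linears[-1].weight)   # branch starts as the identity map
        for lin in linears:
            nn.init.zeros_(lin.bias)

branches = [nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 64))
            for _ in range(8)]
fixup_init(branches)
```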

GraphNorm: A principled approach to accelerating graph neural network training

T Cai, S Luo, K Xu, D He, T Liu… - … Conference on Machine …, 2021 - proceedings.mlr.press
Normalization is known to help the optimization of deep neural networks. Curiously, different
architectures require specialized normalization methods. In this paper, we study what …
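In the spirit of the method named in this entry, the following hedged sketch normalizes node features over all nodes of a single graph and uses a learnable parameter to control how much of the mean is subtracted; the module name, shapes, and epsilon are assumptions for illustration.

```python
# Per-graph normalization with a learnable mean shift (GraphNorm-style sketch).
import torch
import torch.nn as nn

class GraphNormSketch(nn.Module):
    def __init__(self, dim, eps=1e-5):
        super().__init__()
        self.alpha = nn.Parameter(torch.ones(dim))   # learnable mean shift
        self.gamma = nn.Parameter(torch.ones(dim))   # learnable scale
        self.beta = nn.Parameter(torch.zeros(dim))   # learnable bias
        self.eps = eps

    def forward(self, h):
        # h: (num_nodes, dim) node features of one graph
        mean = h.mean(dim=0, keepdim=True)
        shifted = h - self.alpha * mean
        var = shifted.pow(2).mean(dim=0, keepdim=True)
        return self.gamma * shifted / torch.sqrt(var + self.eps) + self.beta

h = torch.randn(10, 16)              # one graph: 10 nodes, 16 features
print(GraphNormSketch(16)(h).shape)  # -> torch.Size([10, 16])
```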