Model compression and acceleration for deep neural networks: The principles, progress, and challenges

Y Cheng, D Wang, P Zhou… - IEEE Signal Processing …, 2018 - ieeexplore.ieee.org
In recent years, deep neural networks (DNNs) have received increasing attention, been
applied to a wide range of applications, and achieved dramatic accuracy improvements in many …

Recent advances in convolutional neural network acceleration

Q Zhang, M Zhang, T Chen, Z Sun, Y Ma, B Yu - Neurocomputing, 2019 - Elsevier
In recent years, convolutional neural networks (CNNs) have shown strong performance in
various fields such as image classification, pattern recognition, and multimedia …

FNet: Mixing tokens with Fourier transforms

J Lee-Thorp, J Ainslie, I Eckstein, S Ontanon - arXiv preprint arXiv …, 2021 - arxiv.org
We show that Transformer encoder architectures can be sped up, with limited accuracy
costs, by replacing the self-attention sublayers with simple linear transformations that "mix" …
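In the paper, the mixing operation is an unparametrized two-dimensional discrete Fourier transform: one FFT along the sequence axis and one along the hidden axis, keeping only the real part. A minimal numpy sketch of that sublayer, assuming this formulation (function name and shapes are illustrative):

import numpy as np

def fnet_mixing(x):
    # FNet-style mixing sublayer: an FFT along the sequence axis and
    # one along the hidden axis (np.fft.fft2 covers both), keeping only
    # the real part. No learned parameters are involved.
    return np.real(np.fft.fft2(x))

x = np.random.randn(128, 64)   # 128 tokens, 64-dim embeddings
mixed = fnet_mixing(x)         # same shape; token information is now mixed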

Monarch mixer: A simple sub-quadratic GEMM-based architecture

D Fu, S Arora, J Grogan, I Johnson… - Advances in …, 2024 - proceedings.neurips.cc
Machine learning models are increasingly being scaled in both sequence length
and model dimension to reach longer contexts and better performance. However, existing …

A survey of model compression and acceleration for deep neural networks

Y Cheng, D Wang, P Zhou, T Zhang - arXiv preprint arXiv:1710.09282, 2017 - arxiv.org
Deep neural networks (DNNs) have recently achieved great success in many visual
recognition tasks. However, existing deep neural network models are computationally …

Monarch: Expressive structured matrices for efficient and accurate training

T Dao, B Chen, NS Sohoni, A Desai… - International …, 2022 - proceedings.mlr.press
Large neural networks excel in many domains, but they are expensive to train and fine-tune.
A popular approach to reduce their compute or memory requirements is to replace dense …
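The structured replacement proposed here factors a dense matrix into permuted block-diagonal pieces, which brings a matvec from O(n^2) down to O(n^1.5). A minimal numpy sketch, assuming the paper's M = P L P^T R parametrization with m blocks of size m x m and P the reshape-transpose permutation (all names are illustrative):

import numpy as np

def monarch_matvec(L, R, x):
    # Matvec with a Monarch-style structured matrix
    # M = P @ blockdiag(L) @ P.T @ blockdiag(R), where L and R each
    # hold m blocks of size m x m and P is the reshape-transpose
    # permutation. Cost is O(n^1.5) for n = m*m instead of O(n^2).
    m = R.shape[0]
    t = np.einsum('bij,bj->bi', R, x.reshape(m, m))  # block-diagonal R
    t = t.T                                          # permutation P^T
    t = np.einsum('bij,bj->bi', L, t)                # block-diagonal L
    return t.T.reshape(-1)                           # permutation P, flatten

m = 8                                  # n = 64
L = np.random.randn(m, m, m)
R = np.random.randn(m, m, m)
x = np.random.randn(m * m)
y = monarch_matvec(L, R, x)            # shape (64,)

Because each factor is block-diagonal, the whole product reduces to batched small matrix multiplications, which is what makes the construction map well onto GEMM hardware.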

Hypernetworks

D Ha, A Dai, QV Le - arXiv preprint arXiv:1609.09106, 2016 - arxiv.org
This work explores hypernetworks: an approach of using one network, also known as a
hypernetwork, to generate the weights for another network. Hypernetworks provide an …
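The core idea is that the target network's weights are the output of another, typically much smaller, network rather than free parameters. A minimal numpy sketch of a linear hypernetwork generating the weights of one fully connected layer (all names and sizes are illustrative assumptions):

import numpy as np

def hypernetwork(z, W_h, b_h):
    # The hypernetwork maps a small embedding z to the flattened
    # weights of the target layer.
    return W_h @ z + b_h

def target_layer(x, flat_w, in_dim, out_dim):
    # The target layer's weights are not learned directly; they are
    # produced by the hypernetwork from the embedding z.
    W = flat_w.reshape(out_dim, in_dim)
    return np.tanh(W @ x)

in_dim, out_dim, z_dim = 16, 4, 8
rng = np.random.default_rng(0)
W_h = rng.normal(size=(out_dim * in_dim, z_dim))   # hypernetwork parameters
b_h = rng.normal(size=out_dim * in_dim)
z = rng.normal(size=z_dim)                          # layer embedding
x = rng.normal(size=in_dim)
y = target_layer(x, hypernetwork(z, W_h, b_h), in_dim, out_dim)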

Recent advances in convolutional neural networks

J Gu, Z Wang, J Kuen, L Ma, A Shahroudy, B Shuai… - Pattern recognition, 2018 - Elsevier
In the last few years, deep learning has led to very good performance on a variety of
problems, such as visual recognition, speech recognition, and natural language processing …

Generalisation error in learning with random features and the hidden manifold model

F Gerace, B Loureiro, F Krzakala… - International …, 2020 - proceedings.mlr.press
We study generalised linear regression and classification for a synthetically generated
dataset encompassing different problems of interest, such as learning with random features …
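In the random-features setup studied in this line of work, only a linear readout is trained on top of a fixed random projection of the inputs. A minimal numpy sketch with ridge regression as the learner (the teacher, activation, and dimensions are illustrative assumptions):

import numpy as np

rng = np.random.default_rng(0)
n, d, p = 200, 30, 100                     # samples, input dim, random features
X = rng.normal(size=(n, d))
y = np.sign(X @ rng.normal(size=d))        # a simple planted teacher for labels

F = rng.normal(size=(p, d)) / np.sqrt(d)   # fixed random projection, never trained
Phi = np.tanh(X @ F.T)                     # random features sigma(F x)

lam = 1e-2                                 # ridge regularisation strength
w = np.linalg.solve(Phi.T @ Phi + lam * np.eye(p), Phi.T @ y)  # trained readout
train_error = np.mean(np.sign(Phi @ w) != y)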

Modeling the influence of data structure on learning in neural networks: The hidden manifold model

S Goldt, M Mézard, F Krzakala, L Zdeborová - Physical Review X, 2020 - APS
Understanding the reasons for the success of deep neural networks trained using stochastic
gradient-based methods is a key open problem for the nascent theory of deep learning. The …
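In the hidden manifold model, inputs are a fixed nonlinear projection of low-dimensional latent variables, and labels depend on the latents rather than on the inputs directly. A minimal numpy sketch of that generative process, assuming the x = f(Cz) construction with a teacher acting on the latents (dimensions and nonlinearities are illustrative):

import numpy as np

rng = np.random.default_rng(1)
n, D, N = 500, 10, 100              # samples, latent dim, ambient input dim
C = rng.normal(size=(N, D))         # fixed projection from latent to input space
Z = rng.normal(size=(n, D))         # latent variables: hidden manifold coordinates
X = np.tanh(Z @ C.T / np.sqrt(D))   # inputs lie on a curved D-dim manifold in R^N
theta = rng.normal(size=D)          # teacher acting on the latents
y = np.sign(Z @ theta)              # labels depend on z, not on x directly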