Model compression and acceleration for deep neural networks: The principles, progress, and challenges

Y Cheng, D Wang, P Zhou… - IEEE Signal Processing …, 2018 - ieeexplore.ieee.org
In recent years, deep neural networks (DNNs) have received increasing attention, been
applied to a wide range of applications, and achieved dramatic accuracy improvements in many …

Recent advances in convolutional neural network acceleration

Q Zhang, M Zhang, T Chen, Z Sun, Y Ma, B Yu - Neurocomputing, 2019 - Elsevier
In recent years, convolutional neural networks (CNNs) have shown strong performance in
various fields such as image classification, pattern recognition, and multimedia …

FNet: Mixing tokens with Fourier transforms

J Lee-Thorp, J Ainslie, I Eckstein, S Ontanon - arXiv preprint arXiv …, 2021 - arxiv.org
We show that Transformer encoder architectures can be sped up, with limited accuracy
costs, by replacing the self-attention sublayers with simple linear transformations that "mix" …
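In the paper, the mixing operation is an unparametrized two-dimensional discrete Fourier transform: one FFT along the sequence axis and one along the hidden axis, keeping only the real part. A minimal numpy sketch of that sublayer, assuming this formulation (function name and shapes are illustrative):

import numpy as np

def fnet_mixing(x):
    # FNet-style mixing sublayer: an FFT along the sequence axis and
    # one along the hidden axis (np.fft.fft2 covers both), keeping only
    # the real part. No learned parameters are involved.
    return np.real(np.fft.fft2(x))

x = np.random.randn(128, 64)   # 128 tokens, 64-dim embeddings
mixed = fnet_mixing(x)         # same shape; token information is now mixed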

Monarch mixer: A simple sub-quadratic GEMM-based architecture

D Fu, S Arora, J Grogan, I Johnson… - Advances in …, 2024 - proceedings.neurips.cc
Machine learning models are increasingly being scaled in both sequence length
and model dimension to reach longer contexts and better performance. However, existing …

A survey of model compression and acceleration for deep neural networks

Y Cheng, D Wang, P Zhou, T Zhang - arXiv preprint arXiv:1710.09282, 2017 - arxiv.org
Deep neural networks (DNNs) have recently achieved great success in many visual
recognition tasks. However, existing deep neural network models are computationally …

Monarch: Expressive structured matrices for efficient and accurate training

T Dao, B Chen, NS Sohoni, A Desai… - International …, 2022 - proceedings.mlr.press
Large neural networks excel in many domains, but they are expensive to train and fine-tune.
A popular approach to reduce their compute or memory requirements is to replace dense …
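The structured replacement proposed here factors a dense matrix into permuted block-diagonal pieces, which brings a matvec from O(n^2) down to O(n^1.5). A minimal numpy sketch, assuming the paper's M = P L P^T R parametrization with m blocks of size m x m and P the reshape-transpose permutation (all names are illustrative):

import numpy as np

def monarch_matvec(L, R, x):
    # Matvec with a Monarch-style structured matrix
    # M = P @ blockdiag(L) @ P.T @ blockdiag(R), where L and R each
    # hold m blocks of size m x m and P is the reshape-transpose
    # permutation. Cost is O(n^1.5) for n = m*m instead of O(n^2).
    m = R.shape[0]
    t = np.einsum('bij,bj->bi', R, x.reshape(m, m))  # block-diagonal R
    t = t.T                                          # permutation P^T
    t = np.einsum('bij,bj->bi', L, t)                # block-diagonal L
    return t.T.reshape(-1)                           # permutation P, flatten

m = 8                                  # n = 64
L = np.random.randn(m, m, m)
R = np.random.randn(m, m, m)
x = np.random.randn(m * m)
y = monarch_matvec(L, R, x)            # shape (64,)

Because each factor is block-diagonal, the whole product reduces to batched small matrix multiplications, which is what makes the construction map well onto GEMM hardware.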

Hypernetworks

D Ha, A Dai, QV Le - arXiv preprint arXiv:1609.09106, 2016 - arxiv.org
This work explores hypernetworks: an approach of using one network, also known as a
hypernetwork, to generate the weights for another network. Hypernetworks provide an …
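The core idea is that the target network's weights are the output of another, typically much smaller, network rather than free parameters. A minimal numpy sketch of a linear hypernetwork generating the weights of one fully connected layer (all names and sizes are illustrative assumptions):

import numpy as np

def hypernetwork(z, W_h, b_h):
    # The hypernetwork maps a small embedding z to the flattened
    # weights of the target layer.
    return W_h @ z + b_h

def target_layer(x, flat_w, in_dim, out_dim):
    # The target layer's weights are not learned directly; they are
    # produced by the hypernetwork from the embedding z.
    W = flat_w.reshape(out_dim, in_dim)
    return np.tanh(W @ x)

in_dim, out_dim, z_dim = 16, 4, 8
rng = np.random.default_rng(0)
W_h = rng.normal(size=(out_dim * in_dim, z_dim))   # hypernetwork parameters
b_h = rng.normal(size=out_dim * in_dim)
z = rng.normal(size=z_dim)                          # layer embedding
x = rng.normal(size=in_dim)
y = target_layer(x, hypernetwork(z, W_h, b_h), in_dim, out_dim)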

Recent advances in convolutional neural networks

J Gu, Z Wang, J Kuen, L Ma, A Shahroudy, B Shuai… - Pattern recognition, 2018 - Elsevier
In the last few years, deep learning has led to very good performance on a variety of
problems, such as visual recognition, speech recognition, and natural language processing …

Generalisation error in learning with random features and the hidden manifold model

F Gerace, B Loureiro, F Krzakala… - International …, 2020 - proceedings.mlr.press
We study generalised linear regression and classification for a synthetically generated
dataset encompassing different problems of interest, such as learning with random features …
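In the random-features setup studied in this line of work, only a linear readout is trained on top of a fixed random projection of the inputs. A minimal numpy sketch with ridge regression as the learner (the teacher, activation, and dimensions are illustrative assumptions):

import numpy as np

rng = np.random.default_rng(0)
n, d, p = 200, 30, 100                     # samples, input dim, random features
X = rng.normal(size=(n, d))
y = np.sign(X @ rng.normal(size=d))        # a simple planted teacher for labels

F = rng.normal(size=(p, d)) / np.sqrt(d)   # fixed random projection, never trained
Phi = np.tanh(X @ F.T)                     # random features sigma(F x)

lam = 1e-2                                 # ridge regularisation strength
w = np.linalg.solve(Phi.T @ Phi + lam * np.eye(p), Phi.T @ y)  # trained readout
train_error = np.mean(np.sign(Phi @ w) != y)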

Modeling the influence of data structure on learning in neural networks: The hidden manifold model

S Goldt, M Mézard, F Krzakala, L Zdeborová - Physical Review X, 2020 - APS
Understanding the reasons for the success of deep neural networks trained using stochastic
gradient-based methods is a key open problem for the nascent theory of deep learning. The …
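In the hidden manifold model, inputs are a fixed nonlinear projection of low-dimensional latent variables, and labels depend on the latents rather than on the inputs directly. A minimal numpy sketch of that generative process, assuming the x = f(Cz) construction with a teacher acting on the latents (dimensions and nonlinearities are illustrative):

import numpy as np

rng = np.random.default_rng(1)
n, D, N = 500, 10, 100              # samples, latent dim, ambient input dim
C = rng.normal(size=(N, D))         # fixed projection from latent to input space
Z = rng.normal(size=(n, D))         # latent variables: hidden manifold coordinates
X = np.tanh(Z @ C.T / np.sqrt(D))   # inputs lie on a curved D-dim manifold in R^N
theta = rng.normal(size=D)          # teacher acting on the latents
y = np.sign(Z @ theta)              # labels depend on z, not on x directly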