Randomized algorithms for computation of Tucker decomposition and higher order SVD (HOSVD)
Big data analysis has become a crucial part of new emerging technologies such as the
internet of things, cyber-physical analysis, deep learning, anomaly detection, etc. Among …
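The snippet above only motivates the problem, so a concrete illustration may help. Below is a minimal NumPy sketch of a randomized HOSVD: each mode unfolding is sketched with a Gaussian test matrix to obtain an orthonormal factor, and the core is formed by contracting the tensor with the transposed factors. The function name, the oversampling parameter, and the plain range finder are illustrative assumptions, not the algorithm of the cited paper.

import numpy as np

def randomized_hosvd(X, ranks, oversample=5, seed=0):
    # Hypothetical sketch of a randomized Tucker/HOSVD, not the cited paper's method.
    rng = np.random.default_rng(seed)
    factors = []
    for n, r in enumerate(ranks):
        # mode-n unfolding: move axis n to the front, flatten the rest
        Xn = np.moveaxis(X, n, 0).reshape(X.shape[n], -1)
        # randomized range finder: sketch with a Gaussian test matrix, then QR
        Omega = rng.standard_normal((Xn.shape[1], r + oversample))
        Q, _ = np.linalg.qr(Xn @ Omega)
        factors.append(Q[:, :r])
    # core tensor: contract every mode with the transposed factor matrix
    core = X
    for n, U in enumerate(factors):
        core = np.moveaxis(np.tensordot(U.T, core, axes=(1, n)), 0, n)
    return core, factors

# toy usage: compress a 20x30x40 tensor to multilinear rank (5, 5, 5)
X = np.random.default_rng(1).standard_normal((20, 30, 40))
core, factors = randomized_hosvd(X, ranks=(5, 5, 5))
print(core.shape, [U.shape for U in factors])  # (5, 5, 5) and [(20, 5), (30, 5), (40, 5)]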
A literature survey of matrix methods for data science
M Stoll - GAMM‐Mitteilungen, 2020 - Wiley Online Library
Efficient numerical linear algebra is a core ingredient in many applications across almost all
scientific and industrial disciplines. With this survey we want to illustrate that numerical linear …
Towards compact neural networks via end-to-end training: A Bayesian tensor approach with automatic rank determination
Post-training model compression can reduce the inference costs of deep neural networks,
but uncompressed training still consumes enormous hardware resources and energy. To …
Training acceleration of low-rank decomposed networks using sequential freezing and rank quantization
H Hajimolahoseini, W Ahmed, Y Liu - arXiv preprint arXiv:2309.03824, 2023 - arxiv.org
Low Rank Decomposition (LRD) is a model compression technique applied to the weight
tensors of deep learning models in order to reduce the number of trainable parameters and …
[PDF] Strategies for applying low rank decomposition to transformer-based models
H Hajimolahoseini, W Ahmed… - 36th Conference …, 2022 - neurips2022-enlsp.github.io
Low rank decomposition factorizes each fully-connected layer of the transformer modules
into two smaller layers using Singular Value Decomposition. The state-of-the-art techniques …
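The mechanism described here, splitting a fully-connected layer into two smaller ones via SVD, is concrete enough to sketch. The PyTorch snippet below is a generic illustration under assumed names (factorize_linear, the rank choice), not the authors' implementation. The point of the split is parameter count: a d_out x d_in layer holds d_out*d_in weights, while the rank-r pair holds r*(d_in + d_out), a saving whenever r < d_out*d_in / (d_in + d_out).

import torch
import torch.nn as nn

def factorize_linear(layer: nn.Linear, rank: int) -> nn.Sequential:
    # Hypothetical helper: replace one Linear layer by two via truncated SVD,
    # W ~ (U_r * S_r) @ Vh_r, so x -> Linear(in, r) -> Linear(r, out).
    W = layer.weight.data  # shape (out_features, in_features)
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)
    U_r, S_r, Vh_r = U[:, :rank], S[:rank], Vh[:rank, :]

    first = nn.Linear(layer.in_features, rank, bias=False)
    second = nn.Linear(rank, layer.out_features, bias=layer.bias is not None)
    first.weight.data = Vh_r          # (rank, in_features)
    second.weight.data = U_r * S_r    # (out_features, rank), columns scaled by singular values
    if layer.bias is not None:
        second.bias.data = layer.bias.data
    return nn.Sequential(first, second)

# toy usage: a 512 -> 512 layer replaced by a rank-64 pair
layer = nn.Linear(512, 512)
compressed = factorize_linear(layer, rank=64)
x = torch.randn(8, 512)
rel_err = (layer(x) - compressed(x)).norm() / layer(x).norm()
print(f"relative error at rank 64: {rel_err:.3f}")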
Reduced-order modeling of deep neural networks
We introduce a new method for speeding up the inference of deep neural networks. It is
somewhat inspired by the reduced-order modeling techniques for dynamical systems. The …
Edge AI–A Promising Technology
Edge Artificial Intelligence (Edge AI) has become the buzzword for every industry
organization. Edge intelligence utilizes edge computing to access and analyze the data from …
Octave deep compression: In-parallel pruning-quantization on different frequencies
Though deep neural networks achieve great accuracy in visual recognition tasks, they
contain millions of weights and thus require a large space to be stored. This presents a …
[BOOK] Compressed Training for Uncertainty-Aware Compact Neural Networks
CP Hawkins - 2022 - search.proquest.com
The rising computational and memory demands of machine learning models, particularly in
resource-constrained edge-device settings, motivate us to develop compressed models that …
[PDF] Compressing neural networks through CP-decomposition
T Rudkiewicz - 2023 - perso.crans.org
Today, neural networks are the state of the art for numerous tasks, including image classification.
The best neural networks use an enormous number of parameters. Some works try to reduce …
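CP decomposition plays the same role for higher-order weight tensors (e.g. convolution kernels) that the SVD split above plays for matrices: the kernel is approximated by a sum of R rank-one tensors. Below is a minimal sketch, assuming the tensorly library is available; the kernel shape, rank, and parameter arithmetic are illustrative, not taken from the cited report.

import numpy as np
import tensorly as tl
from tensorly.decomposition import parafac

# A convolution kernel is a 4-way tensor (out_channels, in_channels, kH, kW);
# CP approximates it by R rank-one terms, so the parameter count drops from
# C_out*C_in*kH*kW to R*(C_out + C_in + kH + kW).
kernel = np.random.default_rng(0).standard_normal((64, 32, 3, 3))

rank = 16
cp = parafac(tl.tensor(kernel), rank=rank, n_iter_max=200, tol=1e-7)
approx = tl.cp_to_tensor(cp)

rel_err = np.linalg.norm(kernel - approx) / np.linalg.norm(kernel)
print(f"relative error: {rel_err:.3f}, params: {kernel.size} -> {rank * sum(kernel.shape)}")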