Efficient joint optimization of layer-adaptive weight pruning in deep neural networks

K Xu, Z Wang, X Geng, M Wu, X Li… - Proceedings of the …, 2023 - openaccess.thecvf.com
In this paper, we propose a novel layer-adaptive weight-pruning approach for Deep Neural
Networks (DNNs) that addresses the challenge of optimizing the output distortion …

An information-theoretic justification for model pruning

B Isik, T Weissman, A No - International Conference on …, 2022 - proceedings.mlr.press
We study the neural network (NN) compression problem, viewing the tension between the
compression ratio and NN performance through the lens of rate-distortion theory. We choose …

Entropy-constrained implicit neural representations for deep image compression

S Lee, JB Jeong, ES Ryu - IEEE Signal Processing Letters, 2023 - ieeexplore.ieee.org
Implicit neural representations (INRs) for various data types have gained popularity in the
field of deep learning owing to their effectiveness. However, previous studies on INRs have …

Rdo-q: Extremely fine-grained channel-wise quantization via rate-distortion optimization

Z Wang, J Lin, X Geng, MMS Aly… - European Conference on …, 2022 - Springer
Allocating different bit widths to different channels and quantizing them independently brings
higher quantization precision and accuracy. Most prior works use equal bit width to …

From Algorithm to Hardware: A Survey on Efficient and Safe Deployment of Deep Neural Networks

X Geng, Z Wang, C Chen, Q Xu, K Xu… - … on Neural Networks …, 2024 - ieeexplore.ieee.org
Deep neural networks (DNNs) have been widely used in many artificial intelligence (AI)
tasks. However, deploying them brings significant challenges due to the huge cost of …

Bandwidth-Efficient Inference for Neural Image Compression

S Yin, T Xu, Y Liang, Y Wang, Y Li… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org
With neural networks growing deeper and feature maps growing larger, limited
communication bandwidth with external memory (or DRAM) and power constraints become …

Flexible Quantization for Efficient Convolutional Neural Networks

FG Zacchigna, S Lew, A Lutenberg - Electronics, 2024 - mdpi.com
This work focuses on the efficient quantization of convolutional neural networks (CNNs).
Specifically, we introduce a method called non-uniform uniform quantization (NUUQ), a …

Channel-Wise Bit Allocation for Deep Visual Feature Quantization

W Wang, Z Chen, Z Wang, J Lin, L Xu… - 2022 IEEE International …, 2022 - ieeexplore.ieee.org
Intermediate deep visual feature compression and transmission is an emerging research
topic, which enables a good balance among computing load, bandwidth usage and …

Mathematical Formalism for Memory Compression in Selective State Space Models

S Bhat - arXiv preprint arXiv:2410.03158, 2024 - arxiv.org
State space models (SSMs) have emerged as a powerful framework for modelling long-
range dependencies in sequence data. Unlike traditional recurrent neural networks (RNNs) …

Statistical Methods for Efficient and Trustworthy Machine Learning

B Isik - 2024 - search.proquest.com
Statistical Methods for Efficient and Trustworthy Machine Learning: a dissertation submitted
to the Department of Electrical Engi …