Efficient joint optimization of layer-adaptive weight pruning in deep neural networks
In this paper, we propose a novel layer-adaptive weight-pruning approach for Deep Neural
Networks (DNNs) that addresses the challenge of optimizing the output distortion …
An information-theoretic justification for model pruning
We study the neural network (NN) compression problem, viewing the tension between the
compression ratio and NN performance through the lens of rate-distortion theory. We choose …
Entropy-constrained implicit neural representations for deep image compression
Implicit neural representations (INRs) for various data types have gained popularity in the
field of deep learning owing to their effectiveness. However, previous studies on INRs have …
Rdo-q: Extremely fine-grained channel-wise quantization via rate-distortion optimization
Allocating different bit widths to different channels and quantizing them independently brings
higher quantization precision and accuracy. Most prior works use equal bit width to …
From Algorithm to Hardware: A Survey on Efficient and Safe Deployment of Deep Neural Networks
Deep neural networks (DNNs) have been widely used in many artificial intelligence (AI)
tasks. However, deploying them brings significant challenges due to the huge cost of …
Bandwidth-Efficient Inference for Neural Image Compression
With neural networks growing deeper and feature maps growing larger, limited
communication bandwidth with external memory (or DRAM) and power constraints become …
Flexible Quantization for Efficient Convolutional Neural Networks
FG Zacchigna, S Lew, A Lutenberg - Electronics, 2024 - mdpi.com
This work focuses on the efficient quantization of convolutional neural networks (CNNs).
Specifically, we introduce a method called non-uniform uniform quantization (NUUQ), a …
Channel-Wise Bit Allocation for Deep Visual Feature Quantization
Intermediate deep visual feature compression and transmission is an emerging research
topic, which enables a good balance among computing load, bandwidth usage and …
Mathematical Formalism for Memory Compression in Selective State Space Models
S Bhat - arXiv preprint arXiv:2410.03158, 2024 - arxiv.org
State space models (SSMs) have emerged as a powerful framework for modelling long-
range dependencies in sequence data. Unlike traditional recurrent neural networks (RNNs) …
Statistical Methods for Efficient and Trustworthy Machine Learning
B Isik - 2024 - search.proquest.com
A dissertation submitted to the Department of Electrical Engineering …