A survey on deep neural network pruning: Taxonomy, comparison, analysis, and recommendations
Modern deep neural networks, particularly recent large language models, come with
massive model sizes that require significant computational and storage resources. To …
Sparsity in deep learning: Pruning and growth for efficient inference and training in neural networks
The growing energy and performance costs of deep learning have driven the community to
reduce the size of neural networks by selectively pruning components. Similarly to their …
Pruning neural networks without any data by iteratively conserving synaptic flow
Pruning the parameters of deep neural networks has generated intense interest due to
potential savings in time, memory and energy both during training and at test time. Recent …
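The title names the core idea: parameters are scored by the synaptic flow they carry on an all-ones input, and pruning proceeds over several iterations so that this flow is conserved. Below is a minimal sketch of that scoring rule for a toy two-layer linear network; the layer shapes, the number of iterations, and the target density are illustrative choices, not values from the paper.

```python
# Hedged sketch of data-free, iterative synaptic-flow scoring for a toy 2-layer
# linear network: R = 1^T |W2| |W1| 1 on an all-ones input, and each parameter's
# score is |theta * dR/dtheta|. Shapes, schedule, and density are illustrative only.
import numpy as np

rng = np.random.default_rng(0)
W1, W2 = rng.standard_normal((64, 32)), rng.standard_normal((10, 64))  # toy layers
M1, M2 = np.ones_like(W1), np.ones_like(W2)                            # binary masks

def synflow_scores(W1, M1, W2, M2):
    A1, A2 = np.abs(W1) * M1, np.abs(W2) * M2
    a = A1 @ np.ones(A1.shape[1])   # flow reaching each hidden unit from the all-ones input
    b = np.ones(A2.shape[0]) @ A2   # flow from each hidden unit to all outputs
    # dR/d|W1|_{jk} = b_j and dR/d|W2|_{ij} = a_j, so the saliencies are:
    return np.abs(W1) * b[:, None], np.abs(W2) * a[None, :]

target_density, iters = 0.1, 10
for t in range(1, iters + 1):
    s1, s2 = synflow_scores(W1, M1, W2, M2)
    keep = target_density ** (t / iters)                 # exponential sparsity schedule
    scores = np.concatenate([(s1 * M1).ravel(), (s2 * M2).ravel()])
    k = int(round(keep * scores.size))
    thresh = np.partition(scores, -k)[-k]                # k-th largest surviving score
    M1 = M1 * (s1 * M1 >= thresh)
    M2 = M2 * (s2 * M2 >= thresh)

print("final density:", (M1.sum() + M2.sum()) / (M1.size + M2.size))
```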
Sparse training via boosting pruning plasticity with neuroregeneration
Work on the lottery ticket hypothesis (LTH) and single-shot network pruning (SNIP) has recently drawn considerable attention to post-training pruning (iterative magnitude pruning) and before …
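The post-training pruning this entry refers to is iterative magnitude pruning: repeatedly train, remove the smallest-magnitude surviving weights, and retrain. A minimal, generic sketch of the prune step follows; the training loop is stubbed out, and the 20% per-round rate over five rounds is an arbitrary illustrative choice.

```python
# Generic sketch of one iterative-magnitude-pruning round: the smallest-magnitude
# surviving weights are zeroed out, and the remaining ones would then be retrained.
import numpy as np

def magnitude_prune(weights, mask, fraction):
    """Zero out the given fraction of surviving weights with the smallest magnitudes."""
    n_remove = int(fraction * int(mask.sum()))
    if n_remove == 0:
        return mask
    surviving = np.abs(weights)[mask > 0]
    thresh = np.partition(surviving, n_remove - 1)[n_remove - 1]  # n_remove-th smallest
    return mask * (np.abs(weights) > thresh)

rng = np.random.default_rng(0)
W = rng.standard_normal((256, 256))
mask = np.ones_like(W)
for round_ in range(5):
    # ... train or fine-tune the masked network (W * mask) here ...
    mask = magnitude_prune(W, mask, 0.2)
    print(f"round {round_}: density = {mask.mean():.3f}")
```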
The emergence of essential sparsity in large pre-trained models: The weights that matter
Large pre-trained transformers are the show-stealers in modern-day deep learning,
and it is becoming crucial to comprehend the parsimonious patterns that exist within them as …
Do we actually need dense over-parameterization? In-time over-parameterization in sparse training
In this paper, we introduce a new perspective on training deep neural networks capable of
state-of-the-art performance without the need for the expensive over-parameterization by …
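This entry concerns sparse-to-sparse (dynamic sparse) training, where a fixed parameter budget is kept throughout training but the sparse connectivity is repeatedly updated. A hedged sketch of the usual prune-and-regrow update is below; it uses SET-style random regrowth and an arbitrary 30% drop fraction, which is only one instantiation of the idea, not this paper's exact procedure.

```python
# Hedged sketch of a prune-and-regrow update in dynamic sparse training: overall
# density stays fixed, but the weakest active connections are dropped and the same
# number of inactive connections is regrown (randomly here; gradient-based regrowth
# is a common alternative).
import numpy as np

rng = np.random.default_rng(0)

def prune_and_regrow(W, mask, drop_fraction=0.3):
    n_drop = int(drop_fraction * mask.sum())
    # drop the n_drop weakest active connections
    active_scores = np.where(mask > 0, np.abs(W), np.inf)
    drop_idx = np.argsort(active_scores, axis=None)[:n_drop]
    mask.flat[drop_idx] = 0.0
    # regrow the same number of currently inactive connections, initialised to zero
    inactive = np.flatnonzero(mask == 0)
    grow_idx = rng.choice(inactive, size=n_drop, replace=False)
    mask.flat[grow_idx] = 1.0
    W.flat[grow_idx] = 0.0
    return W, mask

W = rng.standard_normal((128, 128))
mask = (rng.random(W.shape) < 0.1).astype(float)   # start at 10% density
for step in range(10):
    # ... a few SGD steps on the masked network (W * mask) would go here ...
    W, mask = prune_and_regrow(W, mask)
print("density after connectivity exploration:", mask.mean())
```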
Structural pruning via latency-saliency knapsack
Structural pruning can simplify network architecture and improve inference speed. We
propose Hardware-Aware Latency Pruning (HALP) that formulates structural pruning as a …
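The knapsack framing in the title can be read as: each prunable channel has a saliency (the value of keeping it) and a latency cost, and pruning keeps the highest-value set of channels that fits a latency budget. The sketch below uses a toy greedy value-per-latency heuristic with made-up numbers; HALP itself relies on hardware latency lookup tables and a dedicated knapsack solver, so this is only a stand-in for the formulation.

```python
# Toy illustration of the latency-saliency knapsack view of structural pruning:
# keep the channels with the best saliency per unit of latency until the budget is spent.
from dataclasses import dataclass

@dataclass
class Channel:
    layer: str
    index: int
    saliency: float    # e.g. an accumulated importance score for this channel
    latency_ms: float  # assumed marginal latency of keeping it on the target device

def select_channels(channels, latency_budget_ms):
    kept, spent = [], 0.0
    for ch in sorted(channels, key=lambda c: c.saliency / c.latency_ms, reverse=True):
        if spent + ch.latency_ms <= latency_budget_ms:
            kept.append(ch)
            spent += ch.latency_ms
    return kept, spent

channels = [
    Channel("conv1", 0, saliency=0.9, latency_ms=0.05),
    Channel("conv1", 1, saliency=0.1, latency_ms=0.05),
    Channel("conv2", 0, saliency=0.7, latency_ms=0.02),
    Channel("conv2", 1, saliency=0.4, latency_ms=0.02),
]
kept, spent = select_channels(channels, latency_budget_ms=0.1)
print([(c.layer, c.index) for c in kept], f"spent {spent:.2f} ms")
```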
Efficient joint optimization of layer-adaptive weight pruning in deep neural networks
In this paper, we propose a novel layer-adaptive weight-pruning approach for Deep Neural
Networks (DNNs) that addresses the challenge of optimizing the output distortion …
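The snippet is cut off before the method is described, so the code below is only a generic illustration of what layer-adaptive allocation can mean: a global removal budget is spent greedily on the weights whose estimated contribution to output distortion is smallest, which naturally yields different pruning ratios per layer. The distortion proxy (squared magnitude times a per-layer sensitivity) and all numbers are assumptions for illustration, not the paper's objective.

```python
# Generic illustration of layer-adaptive pruning: distribute a global removal budget
# across layers by always removing the weight with the smallest estimated distortion.
import heapq
import numpy as np

rng = np.random.default_rng(0)
layers = {"fc1": rng.standard_normal(400), "fc2": rng.standard_normal(200)}
sensitivity = {"fc1": 0.5, "fc2": 2.0}   # assumed per-layer distortion scales

# priority queue of (estimated distortion, layer, weight index), cheapest removals first
heap = [(sensitivity[name] * w ** 2, name, i)
        for name, ws in layers.items() for i, w in enumerate(ws)]
heapq.heapify(heap)

budget = 300                              # total number of weights to remove
removed = {name: 0 for name in layers}
for _ in range(budget):
    _, name, i = heapq.heappop(heap)
    layers[name][i] = 0.0
    removed[name] += 1

for name, ws in layers.items():
    print(f"{name}: pruned {removed[name]}/{ws.size} weights (layer-specific ratio)")
```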
When to prune? a policy towards early structural pruning
Pruning enables appealing reductions in network memory footprint and time complexity.
Conventional post-training pruning techniques lean towards efficient inference while …
Winning the lottery ahead of time: Efficient early network pruning
Pruning, the task of sparsifying deep neural networks, has received increasing attention recently.
Although state-of-the-art pruning methods extract highly sparse models, they neglect two …