LoRAPrune: Pruning meets low-rank parameter-efficient fine-tuning

M Zhang, H Chen, C Shen, Z Yang, L Ou, X Yu… - arXiv preprint arXiv …, 2023 - arxiv.org
Large pre-trained models (LPMs), such as LLaMA and GLM, have shown exceptional
performance across various tasks through fine-tuning. Although low-rank adaptation (LoRA) …

Pruning's effect on generalization through the lens of training and regularization

T Jin, M Carbin, D Roy, J Frankle… - Advances in Neural …, 2022 - proceedings.neurips.cc
Practitioners frequently observe that pruning improves model generalization. A long-
standing hypothesis based on bias-variance trade-off attributes this generalization …

Fast as CHITA: Neural network pruning with combinatorial optimization

R Benbaki, W Chen, X Meng… - International …, 2023 - proceedings.mlr.press
The sheer size of modern neural networks makes model serving a serious computational
challenge. A popular class of compression techniques overcomes this challenge by pruning …

SInGE: Sparsity via integrated gradients estimation of neuron relevance

E Yvinec, A Dapogny, M Cord… - Advances in Neural …, 2022 - proceedings.neurips.cc
The leap in performance in state-of-the-art computer vision methods is attributed to the
development of deep neural networks. However, it often comes at a computational price …

FALCON: FLOP-Aware Combinatorial Optimization for Neural Network Pruning

X Meng, W Chen, R Benbaki… - International …, 2024 - proceedings.mlr.press
The increasing computational demands of modern neural networks present deployment
challenges on resource-constrained devices. Network pruning offers a solution to reduce …

Register Tiling for Unstructured Sparsity in Neural Network Inference

L Wilkinson, K Cheshmi, MM Dehnavi - Proceedings of the ACM on …, 2023 - dl.acm.org
Unstructured sparse neural networks are an important class of machine learning (ML)
models, as they compact model size and reduce floating point operations. The execution …

UFKT: Unimportant filters knowledge transfer for CNN pruning

CH Sarvani, SR Dubey, M Ghorai - Neurocomputing, 2022 - Elsevier
As the deep learning models have been widely used in recent years, there is a high demand
for reducing the model size in terms of memory and computation without much compromise …

MULTIFLOW: Shifting Towards Task-Agnostic Vision-Language Pruning

M Farina, M Mancini, E Cunegatti… - Proceedings of the …, 2024 - openaccess.thecvf.com
While excellent in transfer learning, Vision-Language models (VLMs) come with high
computational costs due to their large number of parameters. To address this issue …

UPSCALE: Unconstrained channel pruning

A Wan, H Hao, K Patnaik, Y Xu… - International …, 2023 - proceedings.mlr.press
As neural networks grow in size and complexity, inference speeds decline. To combat this,
one of the most effective compression techniques, channel pruning, removes channels from …

Jointly Training and Pruning CNNs via Learnable Agent Guidance and Alignment

A Ganjdanesh, S Gao, H Huang - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Structural model pruning is a prominent approach used for reducing the computational cost
of Convolutional Neural Networks (CNNs) before their deployment on resource-constrained …