A survey on deep neural network pruning: Taxonomy, comparison, analysis, and recommendations
Modern deep neural networks, particularly recent large language models, come with
massive model sizes that require significant computational and storage resources. To …
A survey on lora of large language models
Y Mao, Y Ge, Y Fan, W Xu, Y Mi, Z Hu… - Frontiers of Computer …, 2025 - Springer
Abstract Low-Rank Adaptation (LoRA), which updates the dense neural network layers with
pluggable low-rank matrices, is one of the best performed parameter efficient fine-tuning …
Assessing the brittleness of safety alignment via pruning and low-rank modifications
Large language models (LLMs) show inherent brittleness in their safety mechanisms, as
evidenced by their susceptibility to jailbreaking and even non-malicious fine-tuning. This …
A survey on efficient inference for large language models
Large Language Models (LLMs) have attracted extensive attention due to their remarkable
performance across various tasks. However, the substantial computational and memory …
Survey of different large language model architectures: Trends, benchmarks, and challenges
Large Language Models (LLMs) represent a class of deep learning models adept at
understanding natural language and generating coherent text in response to prompts or …
The efficiency spectrum of large language models: An algorithmic survey
The rapid growth of Large Language Models (LLMs) has been a driving force in
transforming various domains, reshaping the artificial general intelligence landscape …
Nuteprune: Efficient progressive pruning with numerous teachers for large language models
The considerable size of Large Language Models (LLMs) presents notable deployment
challenges, particularly on resource-constrained hardware. Structured pruning offers an …
Model compression and efficient inference for large language models: A survey
Transformer based large language models have achieved tremendous success. However,
the significant memory and computational costs incurred during the inference process make …
Compressing large language models by streamlining the unimportant layer
Large language models (LLM) have been extensively applied in various natural language
tasks and domains, but their applicability is constrained by the large number of parameters …
Efficient Deep Learning Infrastructures for Embedded Computing Systems: A Comprehensive Survey and Future Envision
Deep neural networks (DNNs) have recently achieved impressive success across a wide
range of real-world vision and language processing tasks, spanning from image …