A survey on deep neural network pruning: Taxonomy, comparison, analysis, and recommendations

H Cheng, M Zhang, JQ Shi - IEEE Transactions on Pattern …, 2024 - ieeexplore.ieee.org
Modern deep neural networks, particularly recent large language models, come with
massive model sizes that require significant computational and storage resources. To …

A survey on LoRA of large language models

Y Mao, Y Ge, Y Fan, W Xu, Y Mi, Z Hu… - Frontiers of Computer …, 2025 - Springer
Low-Rank Adaptation (LoRA), which updates the dense neural network layers with
pluggable low-rank matrices, is one of the best-performing parameter-efficient fine-tuning …

Assessing the brittleness of safety alignment via pruning and low-rank modifications

B Wei, K Huang, Y Huang, T Xie, X Qi, M Xia… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) show inherent brittleness in their safety mechanisms, as
evidenced by their susceptibility to jailbreaking and even non-malicious fine-tuning. This …

A survey on efficient inference for large language models

Z Zhou, X Ning, K Hong, T Fu, J Xu, S Li, Y Lou… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) have attracted extensive attention due to their remarkable
performance across various tasks. However, the substantial computational and memory …

Survey of different large language model architectures: Trends, benchmarks, and challenges

M Shao, A Basit, R Karri, M Shafique - IEEE Access, 2024 - ieeexplore.ieee.org
Large Language Models (LLMs) represent a class of deep learning models adept at
understanding natural language and generating coherent text in response to prompts or …

The efficiency spectrum of large language models: An algorithmic survey

T Ding, T Chen, H Zhu, J Jiang, Y Zhong… - arXiv preprint arXiv …, 2023 - researchgate.net
The rapid growth of Large Language Models (LLMs) has been a driving force in
transforming various domains, reshaping the artificial general intelligence landscape …

NutePrune: Efficient progressive pruning with numerous teachers for large language models

S Li, J Chen, X Han, J Bai - arXiv preprint arXiv:2402.09773, 2024 - arxiv.org
The considerable size of Large Language Models (LLMs) presents notable deployment
challenges, particularly on resource-constrained hardware. Structured pruning offers an …

Model compression and efficient inference for large language models: A survey

W Wang, W Chen, Y Luo, Y Long, Z Lin… - arXiv preprint arXiv …, 2024 - arxiv.org
Transformer-based large language models have achieved tremendous success. However,
the significant memory and computational costs incurred during the inference process make …

Compressing large language models by streamlining the unimportant layer

X Chen, Y Hu, J Zhang - arXiv preprint arXiv:2403.19135, 2024 - arxiv.org
Large language models (LLMs) have been extensively applied in various natural language
tasks and domains, but their applicability is constrained by the large number of parameters …

Efficient Deep Learning Infrastructures for Embedded Computing Systems: A Comprehensive Survey and Future Envision

X Luo, D Liu, H Kong, S Huai, H Chen… - ACM Transactions on …, 2024 - dl.acm.org
Deep neural networks (DNNs) have recently achieved impressive success across a wide
range of real-world vision and language processing tasks, spanning from image …