A comprehensive overview of large language models

H Naveed, AU Khan, S Qiu, M Saqib, S Anwar… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) have recently demonstrated remarkable capabilities in
natural language processing tasks and beyond. This success of LLMs has led to a large …

Challenges and applications of large language models

J Kaddour, J Harris, M Mozes, H Bradley… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) went from non-existent to ubiquitous in the machine
learning discourse within a few years. Due to the fast pace of the field, it is difficult to identify …

A simple and effective pruning approach for large language models

M Sun, Z Liu, A Bair, JZ Kolter - arXiv preprint arXiv:2306.11695, 2023 - arxiv.org
As their size increases, Large Language Models (LLMs) are natural candidates for network
pruning methods: approaches that drop a subset of network weights while striving to …

Sheared LLaMA: Accelerating language model pre-training via structured pruning

M Xia, T Gao, Z Zeng, D Chen - arXiv preprint arXiv:2310.06694, 2023 - arxiv.org
The popularity of LLaMA (Touvron et al., 2023a; b) and other recently emerged moderate-
sized large language models (LLMs) highlights the potential of building smaller yet powerful …

Structured information extraction from scientific text with large language models

J Dagdelen, A Dunn, S Lee, N Walker… - Nature …, 2024 - nature.com
Extracting structured knowledge from scientific text remains a challenging task for machine
learning models. Here, we present a simple approach to joint named entity recognition and …

A survey on model compression for large language models

X Zhu, J Li, Y Liu, C Ma, W Wang - arXiv preprint arXiv:2308.07633, 2023 - arxiv.org
Large Language Models (LLMs) have revolutionized natural language processing tasks with
remarkable success. However, their formidable size and computational demands present …

Diffusion model as representation learner

X Yang, X Wang - … of the IEEE/CVF International Conference …, 2023 - openaccess.thecvf.com
Diffusion Probabilistic Models (DPMs) have recently demonstrated impressive
results on various generative tasks. Despite their promise, the learned representations of pre …

GraphAdapter: Tuning vision-language models with dual knowledge graph

X Li, D Lian, Z Lu, J Bai, Z Chen… - Advances in Neural …, 2024 - proceedings.neurips.cc
Adapter-style efficient transfer learning (ETL) has shown excellent performance in the tuning
of vision-language models (VLMs) under the low-data regime, where only a few additional …

QA-LoRA: Quantization-aware low-rank adaptation of large language models

Y Xu, L Xie, X Gu, X Chen, H Chang, H Zhang… - arXiv preprint arXiv …, 2023 - arxiv.org
Recent years have witnessed a rapid development of large language models (LLMs).
Despite the strong ability in many language-understanding tasks, the heavy computational …

A survey on deep neural network pruning: Taxonomy, comparison, analysis, and recommendations

H Cheng, M Zhang, JQ Shi - IEEE Transactions on Pattern …, 2024 - ieeexplore.ieee.org
Modern deep neural networks, particularly recent large language models, come with
massive model sizes that require significant computational and storage resources. To …