A comprehensive overview of large language models

H Naveed, AU Khan, S Qiu, M Saqib, S Anwar… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) have recently demonstrated remarkable capabilities in
natural language processing tasks and beyond. This success of LLMs has led to a large …

A comprehensive survey of continual learning: theory, method and application

L Wang, X Zhang, H Su, J Zhu - IEEE Transactions on Pattern …, 2024 - ieeexplore.ieee.org
To cope with real-world dynamics, an intelligent system needs to incrementally acquire,
update, accumulate, and exploit knowledge throughout its lifetime. This ability, known as …

The rise and potential of large language model based agents: A survey

Z Xi, W Chen, X Guo, W He, Y Ding, B Hong… - arXiv preprint arXiv …, 2023 - arxiv.org
For a long time, humanity has pursued artificial intelligence (AI) equivalent to or surpassing
the human level, with AI agents considered a promising vehicle for this pursuit. AI agents are …

Domain specialization as the key to make large language models disruptive: A comprehensive survey

C Ling, X Zhao, J Lu, C Deng, C Zheng, J Wang… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) have significantly advanced the field of natural language
processing (NLP), providing a highly useful, task-agnostic foundation for a wide range of …

A comprehensive survey of forgetting in deep learning beyond continual learning

Z Wang, E Yang, L Shen… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Forgetting refers to the loss or deterioration of previously acquired knowledge. While
existing surveys on forgetting have primarily focused on continual learning, forgetting is a …

Continual learning for large language models: A survey

T Wu, L Luo, YF Li, S Pan, TT Vu, G Haffari - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) are not amenable to frequent re-training, due to high
training costs arising from their massive scale. However, updates are necessary to endow …

Orthogonal subspace learning for language model continual learning

X Wang, T Chen, Q Ge, H Xia, R Bao, R Zheng… - arXiv preprint arXiv …, 2023 - arxiv.org
Benefiting from massive corpora and advanced hardware, large language models (LLMs)
exhibit remarkable capabilities in language understanding and generation. However, their …
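
The title refers to learning each new task's LoRA update in a subspace orthogonal to those of earlier tasks. Below is a minimal sketch of such an orthogonality penalty, assuming PyTorch; the function name, rank, hidden size, and penalty weight are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

def orthogonality_penalty(new_A: torch.Tensor, old_As: list) -> torch.Tensor:
    """Penalize overlap between the current task's LoRA subspace (rows of new_A)
    and the frozen subspaces of earlier tasks, encouraging updates to lie in
    orthogonal directions. Shapes and weighting here are illustrative."""
    penalty = new_A.new_zeros(())
    for old_A in old_As:
        # Row-space inner products; this term vanishes when the subspaces are orthogonal.
        penalty = penalty + (new_A @ old_A.T).abs().sum()
    return penalty

# Toy usage: rank-4 LoRA factors for a 768-dimensional hidden size.
old_tasks = [torch.randn(4, 768) for _ in range(2)]  # frozen factors from earlier tasks
new_A = nn.Parameter(torch.randn(4, 768))            # trainable factor for the new task

task_loss = torch.tensor(0.0)  # placeholder for the usual language-modeling loss
loss = task_loss + 0.1 * orthogonality_penalty(new_A, [A.detach() for A in old_tasks])
loss.backward()
```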

SAPT: A shared attention framework for parameter-efficient continual learning of large language models

W Zhao, S Wang, Y Hu, Y Zhao, B Qin… - Proceedings of the …, 2024 - aclanthology.org
The continual learning (CL) ability is vital for deploying large language models (LLMs) in the
dynamic world. Existing methods devise the learning module to acquire task-specific …

Residual prompt tuning: Improving prompt tuning with residual reparameterization

A Razdaibiedina, Y Mao, R Hou, M Khabsa… - arXiv preprint arXiv …, 2023 - arxiv.org
Prompt tuning is one of the successful approaches for parameter-efficient tuning of pre-
trained language models. Despite being arguably the most parameter-efficient (tuned soft …
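
The residual reparameterization in the title passes the soft prompt through a small bottleneck MLP with a skip connection, training only the prompt and the MLP while the language model stays frozen. A minimal sketch follows, assuming PyTorch; the class name, prompt length, and layer sizes are illustrative assumptions rather than the paper's exact configuration.

```python
import torch
import torch.nn as nn

class ResidualPromptEncoder(nn.Module):
    """Reparameterizes trainable soft-prompt embeddings through a shallow MLP
    with a residual connection; all sizes here are illustrative."""

    def __init__(self, prompt_len: int = 10, dim: int = 768, bottleneck: int = 128):
        super().__init__()
        # Trainable soft prompt (the backbone LM itself stays frozen).
        self.prompt = nn.Parameter(torch.randn(prompt_len, dim) * 0.02)
        # Shallow bottleneck MLP used for the reparameterization.
        self.mlp = nn.Sequential(
            nn.Linear(dim, bottleneck),
            nn.ReLU(),
            nn.Linear(bottleneck, dim),
        )
        self.norm = nn.LayerNorm(dim)

    def forward(self) -> torch.Tensor:
        # Residual reparameterization: prompt + MLP(prompt).
        return self.norm(self.prompt + self.mlp(self.prompt))

# The resulting (prompt_len, dim) tensor would be prepended to the frozen LM's
# input embeddings; after training, the MLP can be dropped and the final
# prompt values stored directly.
prompt_embeddings = ResidualPromptEncoder()()
print(prompt_embeddings.shape)  # torch.Size([10, 768])
```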

ConPET: Continual parameter-efficient tuning for large language models

C Song, X Han, Z Zeng, K Li, C Chen, Z Liu… - arXiv preprint arXiv …, 2023 - arxiv.org
Continual learning necessitates the continual adaptation of models to newly emerging tasks
while minimizing the catastrophic forgetting of old ones. This is extremely challenging for …