An empirical study of catastrophic forgetting in large language models during continual fine-tuning

Y Luo, Z Yang, F Meng, Y Li, J Zhou… - arXiv preprint arXiv …, 2023 - arxiv.org
Catastrophic forgetting (CF) is a phenomenon that occurs in machine learning when a
model forgets previously learned information while acquiring new knowledge. As large …

Orthogonal subspace learning for language model continual learning

X Wang, T Chen, Q Ge, H Xia, R Bao, R Zheng… - arXiv preprint arXiv …, 2023 - arxiv.org
Benefiting from massive corpora and advanced hardware, large language models (LLMs)
exhibit remarkable capabilities in language understanding and generation. However, their …

SAPT: A shared attention framework for parameter-efficient continual learning of large language models

W Zhao, S Wang, Y Hu, Y Zhao, B Qin… - Proceedings of the …, 2024 - aclanthology.org
The continual learning (CL) ability is vital for deploying large language models (LLMs) in the
dynamic world. Existing methods devise the learning module to acquire task-specific …

Eight methods to evaluate robust unlearning in LLMs

A Lynch, P Guo, A Ewart, S Casper… - arXiv preprint arXiv …, 2024 - arxiv.org
Machine unlearning can be useful for removing harmful capabilities and memorized text
from large language models (LLMs), but there are not yet standardized methods for …

Defending Against Unforeseen Failure Modes with Latent Adversarial Training

S Casper, L Schulze, O Patel… - arXiv preprint arXiv …, 2024 - arxiv.org
AI systems sometimes exhibit harmful unintended behaviors post-deployment. This is often
despite extensive diagnostics and debugging by developers. Minimizing risks from models …

Balancing speciality and versatility: a coarse to fine framework for supervised fine-tuning large language model

H Zhang, Y Wu, D Li, S Yang, R Zhao, Y Jiang… - arXiv preprint arXiv …, 2024 - arxiv.org
Aligned Large Language Models (LLMs) showcase remarkable versatility, capable of
handling diverse real-world tasks. Meanwhile, aligned LLMs are also expected to exhibit …

Localize-and-stitch: Efficient model merging via sparse task arithmetic

Y He, Y Hu, Y Lin, T Zhang, H Zhao - arXiv preprint arXiv:2408.13656, 2024 - arxiv.org
Model merging offers an effective strategy to combine the strengths of multiple finetuned
models into a unified model that preserves the specialized capabilities of each. Existing …

DAPT: A Dual Attention Framework for Parameter-Efficient Continual Learning of Large Language Models

W Zhao, S Wang, Y Hu, Y Zhao, B Qin, X Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
The continual learning (CL) ability is vital for deploying large language models (LLMs) in the
dynamic world. Based on parameter-efficient tuning (PET), existing methods devise the …

Latent Adversarial Training Improves Robustness to Persistent Harmful Behaviors in LLMs

A Sheshadri, A Ewart, P Guo, A Lynch, C Wu… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) can often be made to behave in undesirable ways that they
are explicitly fine-tuned not to. For example, the LLM red-teaming literature has produced a …

Investigating Continual Pretraining in Large Language Models: Insights and Implications

Ç Yıldız, NK Ravichandran, P Punia, M Bethge… - arXiv preprint arXiv …, 2024 - arxiv.org
This paper studies the evolving domain of Continual Learning (CL) in large language
models (LLMs), with a focus on developing strategies for efficient and sustainable training …