Trends and challenges of real-time learning in large language models: A critical review
M Jovanovic, P Voss - arXiv preprint arXiv:2404.18311, 2024 - arxiv.org
Real-time learning concerns the ability of learning systems to acquire knowledge over time,
enabling their adaptation and generalization to novel tasks. It is a critical ability for …
When LLMs meet cybersecurity: A systematic literature review
The rapid advancements in large language models (LLMs) have opened new avenues
across various fields, including cybersecurity, which faces an ever-evolving threat landscape …
Reuse, Don't Retrain: A Recipe for Continued Pretraining of Language Models
As language models have scaled both their number of parameters and pretraining dataset
sizes, the computational cost for pretraining has become intractable except for the most well …
Zamba: A Compact 7B SSM Hybrid Model
P Glorioso, Q Anthony, Y Tokpanov… - arXiv preprint arXiv …, 2024 - arxiv.org
In this technical report, we present Zamba, a novel 7B SSM-transformer hybrid model which
achieves competitive performance against leading open-weight models at a comparable …
A Practitioner's Guide to Continual Multimodal Pretraining
Multimodal foundation models serve numerous applications at the intersection of vision and
language. Still, despite being pretrained on extensive data, they become outdated over time …
CorDA: Context-Oriented Decomposition Adaptation of Large Language Models
Current parameter-efficient fine-tuning (PEFT) methods build adapters without considering the context of the downstream task to learn, or the context of important knowledge to maintain …
the context of downstream task to learn, or the context of important knowledge to maintain …
Towards Effective and Efficient Continual Pre-training of Large Language Models
Continual pre-training (CPT) has been an important approach for adapting language models
to specific domains or tasks. To make the CPT approach more traceable, this paper presents …
Mitigating Catastrophic Forgetting in Language Transfer via Model Merging
As open-weight large language models (LLMs) achieve ever more impressive performances
across a wide range of tasks in English, practitioners aim to adapt these models to different …
How Susceptible are LLMs to Influence in Prompts?
S Anagnostidis, J Bulian - arXiv preprint arXiv:2408.11865, 2024 - arxiv.org
Large Language Models (LLMs) are highly sensitive to prompts, including additional context
provided therein. As LLMs grow in capability, understanding their prompt-sensitivity …
Demystifying Forgetting in Language Model Fine-Tuning with Statistical Analysis of Example Associations
Language models (LMs) are known to suffer from forgetting of previously learned examples when fine-tuned, breaking the stability of deployed LM systems. Despite efforts on mitigating …