A comprehensive survey of continual learning: theory, method and application

L Wang, X Zhang, H Su, J Zhu - IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024 - ieeexplore.ieee.org
To cope with real-world dynamics, an intelligent system needs to incrementally acquire,
update, accumulate, and exploit knowledge throughout its lifetime. This ability, known as …

Meditron-70B: Scaling medical pretraining for large language models

Z Chen, AH Cano, A Romanou, A Bonnet… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) can potentially democratize access to medical knowledge.
While many efforts have been made to harness and improve LLMs' medical knowledge and …

LFPT5: A unified framework for lifelong few-shot language learning based on prompt tuning of T5

C Qin, S Joty - arXiv preprint arXiv:2110.07298, 2021 - arxiv.org
Existing approaches to lifelong language learning rely on abundant labeled data for learning each new task, data that are hard to obtain in most real scenarios. Considering that humans can …

Computational models to study language processing in the human brain: A survey

S Wang, J Sun, Y Zhang, N Lin, MF Moens… - arXiv preprint arXiv …, 2024 - arxiv.org
Despite differing from the human language processing mechanism in implementation and algorithms, current language models demonstrate remarkably human-like or even surpassing …

Continual sequence generation with adaptive compositional modules

Y Zhang, X Wang, D Yang - arXiv preprint arXiv:2203.10652, 2022 - arxiv.org
Continual learning is essential for real-world deployment when there is a need to quickly
adapt the model to new tasks without forgetting knowledge of old tasks. Existing work on …

ConTinTin: Continual learning from task instructions

W Yin, J Li, C Xiong - arXiv preprint arXiv:2203.08512, 2022 - arxiv.org
The mainstream machine learning paradigms for NLP often work with two underlying
presumptions. First, the target task is predefined and static; a system merely needs to learn …

Incremental prompting: Episodic memory prompt for lifelong event detection

M Liu, S Chang, L Huang - arXiv preprint arXiv:2204.07275, 2022 - arxiv.org
Lifelong event detection aims to incrementally update a model with new event types and data while retaining its capability on previously learned types. One critical challenge is …

Fine-tuned vs. prompt-tuned supervised representations: Which better account for brain language representations?

J Sun, MF Moens - arXiv preprint arXiv:2310.01854, 2023 - arxiv.org
To decipher the algorithm underlying the human brain's language representation, previous
work probed brain responses to language input with pre-trained artificial neural network …

Towards quantifiable dialogue coherence evaluation

Z Ye, L Lu, L Huang, L Lin, X Liang - arXiv preprint arXiv:2106.00507, 2021 - arxiv.org
Automatic dialogue coherence evaluation has attracted increasing attention and is crucial
for developing promising dialogue systems. However, existing metrics have two major …

SaulLM-54B & SaulLM-141B: Scaling up domain adaptation for the legal domain

P Colombo, T Pires, M Boudiaf… - The Thirty-eighth Annual Conference on Neural Information Processing Systems, 2024 - openreview.net
In this paper, we introduce SaulLM-medium and SaulLM-large, two large language model (LLM) families tailored for the legal sector. These models, which feature architectures of 54 …