Pre-trained models for natural language processing: A survey

X Qiu, T Sun, Y Xu, Y Shao, N Dai, X Huang - Science China …, 2020 - Springer
Recently, the emergence of pre-trained models (PTMs) has brought natural language
processing (NLP) to a new era. In this survey, we provide a comprehensive review of PTMs …

Neural machine translation for low-resource languages: A survey

S Ranathunga, ESA Lee, M Prifti Skenduli… - ACM Computing …, 2023 - dl.acm.org
Neural Machine Translation (NMT) has seen tremendous growth in less than ten years and
has already entered a mature phase. While considered the most widely …

A general theoretical paradigm to understand learning from human preferences

MG Azar, ZD Guo, B Piot, R Munos… - International …, 2024 - proceedings.mlr.press
The prevalent deployment of learning from human preferences through reinforcement
learning (RLHF) relies on two important approximations: the first assumes that pairwise …
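
The first approximation the snippet alludes to is commonly formalized with a Bradley-Terry model, in which a pairwise preference probability is replaced by a pair of pointwise rewards. A minimal Python sketch of that substitution (the reward values are illustrative only, not from the paper):

    import math

    def bradley_terry_preference(r_a: float, r_b: float) -> float:
        """Probability that response A is preferred over response B,
        assuming pairwise preferences reduce to pointwise rewards
        via a Bradley-Terry / logistic model."""
        return 1.0 / (1.0 + math.exp(-(r_a - r_b)))

    # Toy pointwise rewards for two candidate responses (illustrative values).
    reward_a, reward_b = 1.3, 0.4
    print(bradley_terry_preference(reward_a, reward_b))  # ~0.71: A preferred ~71% of the time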

Socratic models: Composing zero-shot multimodal reasoning with language

A Zeng, M Attarian, B Ichter, K Choromanski… - arXiv preprint arXiv …, 2022 - arxiv.org
Large pretrained (eg," foundation") models exhibit distinct capabilities depending on the
domain of data they are trained on. While these domains are generic, they may only barely …

[PDF][PDF] Multilingual denoising pre-training for neural machine translation

Y Liu - arXiv preprint arXiv:2001.08210, 2020 - fq.pkwyx.com
This paper demonstrates that multilingual denoising pre-training produces significant
performance gains across a wide variety of machine translation (MT) tasks. We present …
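
Denoising pre-training trains a sequence-to-sequence model to reconstruct text from a corrupted copy of itself. A toy sketch of one such corruption (span deletion with a mask token); this only illustrates the general idea, not the paper's exact noise function:

    import random

    # Toy denoising corruption: drop a contiguous span of tokens and replace it
    # with a single mask token; the pre-training target is the original sentence.
    def corrupt(tokens, span_len=2, mask="<mask>", seed=0):
        rng = random.Random(seed)
        if len(tokens) <= span_len:
            return [mask]
        start = rng.randrange(len(tokens) - span_len)
        return tokens[:start] + [mask] + tokens[start + span_len:]

    original = "multilingual denoising pre-training improves machine translation".split()
    print(corrupt(original))   # model input (corrupted)
    print(original)            # reconstruction target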

Pegasus: Pre-training with extracted gap-sentences for abstractive summarization

J Zhang, Y Zhao, M Saleh, P Liu - … conference on machine …, 2020 - proceedings.mlr.press
Recent work pre-training Transformers with self-supervised objectives on large text corpora
has shown great success when fine-tuned on downstream NLP tasks including text …
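
The pre-training objective named in the title, gap-sentence generation, removes whole sentences from a document and asks the model to regenerate them. A rough Python sketch of that masking step, with a simplistic length-based heuristic standing in for the paper's sentence-selection strategy (all names here are illustrative, not the authors' code):

    # Toy gap-sentence masking: pick "important" sentences, replace them with a
    # mask token, and use them as the generation target.
    MASK = "<mask_1>"

    def gap_sentence_split(sentences, ratio=0.3):
        # Stand-in importance score: longer sentences are treated as more salient.
        # (The paper uses a ROUGE-based selection; this is only a placeholder.)
        k = max(1, int(len(sentences) * ratio))
        selected = set(sorted(range(len(sentences)),
                              key=lambda i: len(sentences[i]), reverse=True)[:k])
        source = " ".join(MASK if i in selected else s for i, s in enumerate(sentences))
        target = " ".join(sentences[i] for i in sorted(selected))
        return source, target

    doc = ["Pre-training helps summarization.",
           "Gap-sentence generation masks whole sentences from the input document.",
           "The model learns to generate the masked sentences."]
    src, tgt = gap_sentence_split(doc)
    print(src)
    print(tgt)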

Cross-lingual language model pretraining

A Conneau, G Lample - Advances in neural information …, 2019 - proceedings.neurips.cc
Recent studies have demonstrated the efficiency of generative pretraining for English
natural language understanding. In this work, we extend this approach to multiple …

Exploring the limits of transfer learning with a unified text-to-text transformer

C Raffel, N Shazeer, A Roberts, K Lee, S Narang… - Journal of machine …, 2020 - jmlr.org
Transfer learning, where a model is first pre-trained on a data-rich task before being fine-
tuned on a downstream task, has emerged as a powerful technique in natural language …
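
The pre-train-then-fine-tune recipe the snippet describes can be exercised with an off-the-shelf text-to-text checkpoint. A minimal sketch, assuming the Hugging Face transformers library and the public "t5-small" checkpoint (neither is part of this listing); fine-tuning on a new task would reuse the same encoder-decoder weights with task-specific input and target strings:

    from transformers import T5ForConditionalGeneration, T5Tokenizer

    # Load a pre-trained text-to-text model and run one of its task prefixes.
    tokenizer = T5Tokenizer.from_pretrained("t5-small")
    model = T5ForConditionalGeneration.from_pretrained("t5-small")

    inputs = tokenizer("translate English to German: The house is wonderful.",
                       return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=40)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))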

Flaubert: Unsupervised language model pre-training for French

H Le, L Vial, J Frej, V Segonne, M Coavoux… - arXiv preprint arXiv …, 2019 - arxiv.org
Language models have become a key step to achieve state-of-the-art results in many
different Natural Language Processing (NLP) tasks. Leveraging the huge amount of …

[PDF][PDF] Language models are unsupervised multitask learners

A Radford, J Wu, R Child, D Luan… - OpenAI …, 2019 - insightcivic.s3.us-east-1.amazonaws …
Natural language processing tasks, such as question answering, machine translation,
reading comprehension, and summarization, are typically approached with supervised …