Pre-trained models for natural language processing: A survey

X Qiu, T Sun, Y Xu, Y Shao, N Dai, X Huang - Science China …, 2020 - Springer
Recently, the emergence of pre-trained models (PTMs) has brought natural language
processing (NLP) to a new era. In this survey, we provide a comprehensive review of PTMs …

Neural machine translation for low-resource languages: A survey

S Ranathunga, ESA Lee, M Prifti Skenduli… - ACM Computing …, 2023 - dl.acm.org
Neural Machine Translation (NMT) has seen tremendous growth in less than ten years and
has already entered a mature phase. While considered the most widely …

A general theoretical paradigm to understand learning from human preferences

MG Azar, ZD Guo, B Piot, R Munos… - International …, 2024 - proceedings.mlr.press
The prevalent deployment of learning from human preferences through reinforcement
learning (RLHF) relies on two important approximations: the first assumes that pairwise …
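
The first approximation the snippet alludes to is commonly formalized with a Bradley-Terry model, in which a pairwise preference probability is replaced by a pair of pointwise rewards. A minimal Python sketch of that substitution (the reward values are illustrative only, not from the paper):

    import math

    def bradley_terry_preference(r_a: float, r_b: float) -> float:
        """Probability that response A is preferred over response B,
        assuming pairwise preferences reduce to pointwise rewards
        via a Bradley-Terry / logistic model."""
        return 1.0 / (1.0 + math.exp(-(r_a - r_b)))

    # Toy pointwise rewards for two candidate responses (illustrative values).
    reward_a, reward_b = 1.3, 0.4
    print(bradley_terry_preference(reward_a, reward_b))  # ~0.71: A preferred ~71% of the time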

Socratic models: Composing zero-shot multimodal reasoning with language

A Zeng, M Attarian, B Ichter, K Choromanski… - arXiv preprint arXiv …, 2022 - arxiv.org
Large pretrained (eg," foundation") models exhibit distinct capabilities depending on the
domain of data they are trained on. While these domains are generic, they may only barely …

[PDF][PDF] Multilingual denoising pre-training for neural machine translation

Y Liu - arXiv preprint arXiv:2001.08210, 2020 - fq.pkwyx.com
This paper demonstrates that multilingual denoising pre-training produces significant
performance gains across a wide variety of machine translation (MT) tasks. We present …
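
Denoising pre-training trains a sequence-to-sequence model to reconstruct text from a corrupted copy of itself. A toy sketch of one such corruption (span deletion with a mask token); this only illustrates the general idea, not the paper's exact noise function:

    import random

    # Toy denoising corruption: drop a contiguous span of tokens and replace it
    # with a single mask token; the pre-training target is the original sentence.
    def corrupt(tokens, span_len=2, mask="<mask>", seed=0):
        rng = random.Random(seed)
        if len(tokens) <= span_len:
            return [mask]
        start = rng.randrange(len(tokens) - span_len)
        return tokens[:start] + [mask] + tokens[start + span_len:]

    original = "multilingual denoising pre-training improves machine translation".split()
    print(corrupt(original))   # model input (corrupted)
    print(original)            # reconstruction target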

Pegasus: Pre-training with extracted gap-sentences for abstractive summarization

J Zhang, Y Zhao, M Saleh, P Liu - … conference on machine …, 2020 - proceedings.mlr.press
Recent work pre-training Transformers with self-supervised objectives on large text corpora
has shown great success when fine-tuned on downstream NLP tasks including text …
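
The pre-training objective named in the title, gap-sentence generation, removes whole sentences from a document and asks the model to regenerate them. A rough Python sketch of that masking step, with a simplistic length-based heuristic standing in for the paper's sentence-selection strategy (all names here are illustrative, not the authors' code):

    # Toy gap-sentence masking: pick "important" sentences, replace them with a
    # mask token, and use them as the generation target.
    MASK = "<mask_1>"

    def gap_sentence_split(sentences, ratio=0.3):
        # Stand-in importance score: longer sentences are treated as more salient.
        # (The paper uses a ROUGE-based selection; this is only a placeholder.)
        k = max(1, int(len(sentences) * ratio))
        selected = set(sorted(range(len(sentences)),
                              key=lambda i: len(sentences[i]), reverse=True)[:k])
        source = " ".join(MASK if i in selected else s for i, s in enumerate(sentences))
        target = " ".join(sentences[i] for i in sorted(selected))
        return source, target

    doc = ["Pre-training helps summarization.",
           "Gap-sentence generation masks whole sentences from the input document.",
           "The model learns to generate the masked sentences."]
    src, tgt = gap_sentence_split(doc)
    print(src)
    print(tgt)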

Cross-lingual language model pretraining

A Conneau, G Lample - Advances in neural information …, 2019 - proceedings.neurips.cc
Recent studies have demonstrated the efficiency of generative pretraining for English
natural language understanding. In this work, we extend this approach to multiple …

Exploring the limits of transfer learning with a unified text-to-text transformer

C Raffel, N Shazeer, A Roberts, K Lee, S Narang… - Journal of machine …, 2020 - jmlr.org
Transfer learning, where a model is first pre-trained on a data-rich task before being fine-
tuned on a downstream task, has emerged as a powerful technique in natural language …
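
The pre-train-then-fine-tune recipe the snippet describes can be exercised with an off-the-shelf text-to-text checkpoint. A minimal sketch, assuming the Hugging Face transformers library and the public "t5-small" checkpoint (neither is part of this listing); fine-tuning on a new task would reuse the same encoder-decoder weights with task-specific input and target strings:

    from transformers import T5ForConditionalGeneration, T5Tokenizer

    # Load a pre-trained text-to-text model and run one of its task prefixes.
    tokenizer = T5Tokenizer.from_pretrained("t5-small")
    model = T5ForConditionalGeneration.from_pretrained("t5-small")

    inputs = tokenizer("translate English to German: The house is wonderful.",
                       return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=40)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))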

Flaubert: Unsupervised language model pre-training for French

H Le, L Vial, J Frej, V Segonne, M Coavoux… - arXiv preprint arXiv …, 2019 - arxiv.org
Language models have become a key step to achieve state-of-the-art results in many
different Natural Language Processing (NLP) tasks. Leveraging the huge amount of …

[PDF][PDF] Language models are unsupervised multitask learners

A Radford, J Wu, R Child, D Luan… - OpenAI …, 2019 - insightcivic.s3.us-east-1.amazonaws …
Natural language processing tasks, such as question answering, machine translation,
reading comprehension, and summarization, are typically approached with supervised …