Language models are few-shot learners

T Brown, B Mann, N Ryder… - Advances in neural …, 2020 - proceedings.neurips.cc
We demonstrate that scaling up language models greatly improves task-agnostic, few-shot
performance, sometimes even becoming competitive with prior state-of-the-art fine-tuning …

It's not just size that matters: Small language models are also few-shot learners

T Schick, H Schütze - arXiv preprint arXiv:2009.07118, 2020 - arxiv.org
When scaled to hundreds of billions of parameters, pretrained language models such as
GPT-3 (Brown et al., 2020) achieve remarkable few-shot performance. However, enormous …

Making pre-trained language models better few-shot learners

T Gao, A Fisch, D Chen - arXiv preprint arXiv:2012.15723, 2020 - arxiv.org
The recent GPT-3 model (Brown et al., 2020) achieves remarkable few-shot performance
solely by leveraging a natural-language prompt and a few task demonstrations as input …
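
The in-context setup this entry describes is easy to make concrete: the prompt is a short instruction, a few labeled demonstrations, and the query appended last. A minimal sketch follows; the sentiment task, labels, and examples are illustrative placeholders, not from the paper.

    # Build a few-shot prompt of the kind these papers describe: an
    # instruction, K labeled demonstrations, then the unlabeled query.
    # The sentiment task and examples below are illustrative placeholders.
    demonstrations = [
        ("the acting is superb", "positive"),
        ("a tedious, joyless film", "negative"),
    ]

    def build_prompt(query: str) -> str:
        lines = ["Classify the sentiment of each review."]
        for text, label in demonstrations:
            lines.append(f"Review: {text}\nSentiment: {label}")
        lines.append(f"Review: {query}\nSentiment:")  # model completes the label
        return "\n\n".join(lines)

    print(build_prompt("surprisingly moving and well paced"))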

PaLM: Scaling language modeling with pathways

A Chowdhery, S Narang, J Devlin, M Bosma… - Journal of Machine …, 2023 - jmlr.org
Large language models have been shown to achieve remarkable performance across a
variety of natural language tasks using few-shot learning, which drastically reduces the …

Calibrate before use: Improving few-shot performance of language models

Z Zhao, E Wallace, S Feng, D Klein… - … on machine learning, 2021 - proceedings.mlr.press
GPT-3 can perform numerous tasks when provided a natural language prompt that contains
a few training examples. We show that this type of few-shot learning can be unstable: the …
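
The fix the title refers to, contextual calibration, can be sketched in a few lines: estimate the model's label bias from a content-free input such as "N/A", then divide it out of real predictions. The probability vectors below are made-up stand-ins for model output.

    import numpy as np

    # Contextual calibration, sketched: the model's label distribution on a
    # content-free input reveals its prior bias; rescaling real predictions
    # by the inverse of that bias removes it. Numbers are illustrative.
    p_content_free = np.array([0.7, 0.3])  # p(label | prompt + "N/A")
    p_test = np.array([0.6, 0.4])          # p(label | prompt + real input)

    W = np.diag(1.0 / p_content_free)      # W = diag(p_cf)^-1
    calibrated = W @ p_test
    calibrated /= calibrated.sum()         # renormalize to a distribution

    print(calibrated)                      # bias toward the first label is gone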

Revisiting self-training for few-shot learning of language model

Y Chen, Y Zhang, C Zhang, G Lee, R Cheng… - arXiv preprint arXiv …, 2021 - arxiv.org
As unlabeled data carry rich task-relevant information, they have proven useful for few-shot
learning of language models. The question is how to effectively make use of such data. In this …
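
The generic self-training loop this abstract alludes to looks roughly as follows; the confidence threshold, round count, and the train/predict_proba hooks are illustrative placeholders, not the paper's exact recipe.

    # Generic self-training for few-shot classification, sketched: train on
    # the labeled seed set, pseudo-label confident unlabeled examples, and
    # retrain. train() and predict_proba() are caller-supplied placeholders.
    def self_train(labeled, unlabeled, train, predict_proba,
                   rounds=3, thresh=0.9):
        model = train(list(labeled))
        for _ in range(rounds):
            pseudo = []
            for x in unlabeled:
                probs = predict_proba(model, x)  # dict: label -> probability
                label, conf = max(probs.items(), key=lambda kv: kv[1])
                if conf >= thresh:               # keep only confident guesses
                    pseudo.append((x, label))
            model = train(list(labeled) + pseudo)
        return model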

PERFECT: Prompt-free and efficient few-shot learning with language models

RK Mahabadi, L Zettlemoyer, J Henderson… - arXiv preprint arXiv …, 2022 - arxiv.org
Current methods for few-shot fine-tuning of pretrained masked language models (PLMs)
require carefully engineered prompts and verbalizers for each new task to convert examples …
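
For contrast, the prompt-and-verbalizer machinery that PERFECT aims to remove looks roughly like this; the pattern and verbalizer below are illustrative, hand-written per task as the abstract notes.

    # The pattern/verbalizer setup PERFECT argues against, sketched: a
    # pattern turns an example into a cloze for a masked LM, and a
    # verbalizer maps each label to a vocabulary token whose probability
    # at [MASK] is read off as the prediction. Both are hand-engineered.
    def pattern(review: str) -> str:
        return f"{review} It was [MASK]."

    verbalizer = {"positive": "great", "negative": "terrible"}

    cloze = pattern("a tedious, joyless film")
    # A masked LM would score each verbalizer token at [MASK]; the label
    # whose token scores highest becomes the prediction.
    print(cloze, verbalizer)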

Few-shot learning with multilingual generative language models

XV Lin, T Mihaylov, M Artetxe, T Wang… - Proceedings of the …, 2022 - aclanthology.org
Large-scale generative language models such as GPT-3 are competitive few-shot learners.
While these models are known to be able to jointly represent many different languages, their …

Meta-learning for few-shot natural language processing: A survey

W Yin - arXiv preprint arXiv:2007.09604, 2020 - arxiv.org
Few-shot natural language processing (NLP) refers to NLP tasks that are accompanied by
merely a handful of labeled examples. This is a real-world challenge that an AI system must …

Atlas: Few-shot learning with retrieval augmented language models

G Izacard, P Lewis, M Lomeli, L Hosseini… - Journal of Machine …, 2023 - jmlr.org
Large language models have shown impressive few-shot results on a wide range of tasks.
However, when knowledge is key for such results, as is the case for tasks such as question …
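
The retrieve-then-read pattern behind such retrieval-augmented setups can be sketched as below; embed(), index.search(), and generate() are placeholder components, not Atlas's actual modules.

    # Retrieve-then-read, sketched: fetch the top-k passages for a query
    # and condition the generator on them. All components are placeholders.
    def answer(query, embed, index, generate, k=5):
        passages = index.search(embed(query), k)  # top-k supporting passages
        context = "\n\n".join(passages)
        prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
        return generate(prompt)                   # generation grounded in evidence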