Dissociating language and thought in large language models

K Mahowald, AA Ivanova, IA Blank, N Kanwisher… - Trends in Cognitive …, 2024 - cell.com
Large language models (LLMs) have come closest among all models to date to mastering
human language, yet opinions about their linguistic and cognitive capabilities remain split …

Recent advances in natural language processing via large pre-trained language models: A survey

B Min, H Ross, E Sulem, APB Veyseh… - ACM Computing …, 2023 - dl.acm.org
Large, pre-trained language models (PLMs) such as BERT and GPT have drastically
changed the Natural Language Processing (NLP) field. For numerous NLP tasks …

Pre-trained language models and their applications

H Wang, J Li, H Wu, E Hovy, Y Sun - Engineering, 2022 - Elsevier
Pre-trained language models have achieved striking success in natural language
processing (NLP), leading to a paradigm shift from supervised learning to pre-training …

Learning how to ask: Querying LMs with mixtures of soft prompts

G Qin, J Eisner - arXiv preprint arXiv:2104.06599, 2021 - arxiv.org
Natural-language prompts have recently been used to coax pretrained language models
into performing other AI tasks, using a fill-in-the-blank paradigm (Petroni et al., 2019) or a …
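A minimal sketch of the soft-prompt idea in PyTorch, under stated assumptions: a tiny frozen network stands in for a real pretrained LM, and a single trainable prompt is used rather than the paper's mixture of prompts. Only the continuous prompt vectors receive gradient updates; everything else is kept frozen.

```python
# Soft-prompt tuning sketch (toy, self-contained; no pretrained weights).
import torch
import torch.nn as nn

VOCAB, DIM, PROMPT_LEN = 1000, 64, 5

# Frozen stand-in for a pretrained LM: embeddings + one transformer layer + LM head.
embed = nn.Embedding(VOCAB, DIM)
encoder = nn.TransformerEncoderLayer(d_model=DIM, nhead=4, batch_first=True)
lm_head = nn.Linear(DIM, VOCAB)
for module in (embed, encoder, lm_head):
    for p in module.parameters():
        p.requires_grad = False  # the "pretrained" model stays frozen

# The only trainable parameters: a short sequence of continuous prompt vectors.
soft_prompt = nn.Parameter(torch.randn(PROMPT_LEN, DIM) * 0.02)
optimizer = torch.optim.Adam([soft_prompt], lr=1e-3)

def forward(input_ids):
    tok = embed(input_ids)                                   # (B, T, DIM)
    prompt = soft_prompt.unsqueeze(0).expand(tok.size(0), -1, -1)
    hidden = encoder(torch.cat([prompt, tok], dim=1))        # prepend the soft prompt
    return lm_head(hidden[:, -1])                            # score the "answer" token

# One toy training step: tune the prompt so inputs map to target token ids.
input_ids = torch.randint(0, VOCAB, (8, 10))
targets = torch.randint(0, VOCAB, (8,))
loss = nn.functional.cross_entropy(forward(input_ids), targets)
loss.backward()
optimizer.step()
print("loss:", loss.item())
```

In practice the same pattern is applied on top of a pretrained checkpoint; the sketch only shows where the learnable prompt enters the computation.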

It's not just size that matters: Small language models are also few-shot learners

T Schick, H Schütze - arXiv preprint arXiv:2009.07118, 2020 - arxiv.org
When scaled to hundreds of billions of parameters, pretrained language models such as
GPT-3 (Brown et al., 2020) achieve remarkable few-shot performance. However, enormous …

Leveraging passage retrieval with generative models for open domain question answering

G Izacard, E Grave - arXiv preprint arXiv:2007.01282, 2020 - arxiv.org
Generative models for open domain question answering have proven to be competitive,
without resorting to external knowledge. While promising, this approach requires the use of …
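A simplified sketch of the retrieve-then-generate pattern the abstract describes, assuming the Hugging Face transformers library and the public t5-small checkpoint (assumptions for illustration, not the paper's setup). The retriever here is a toy word-overlap scorer, and the retrieved text is simply concatenated with the question, whereas the paper encodes each (question, passage) pair separately and fuses them in the decoder.

```python
# Retrieval-augmented generation sketch: toy retriever + seq2seq generator.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

passages = [
    "The Eiffel Tower is located in Paris and was completed in 1889.",
    "Mount Everest is the highest mountain above sea level.",
]

def retrieve(question, k=1):
    # Toy lexical retrieval: rank passages by word overlap with the question.
    q = set(question.lower().split())
    return sorted(passages, key=lambda p: -len(q & set(p.lower().split())))[:k]

question = "When was the Eiffel Tower completed?"
context = " ".join(retrieve(question))

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

# Condition generation on the retrieved context rather than on model memory alone.
inputs = tokenizer(f"question: {question} context: {context}", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```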

Codebert: A pre-trained model for programming and natural languages

Z Feng, D Guo, D Tang, N Duan, X Feng… - arXiv preprint arXiv …, 2020 - arxiv.org
We present CodeBERT, a bimodal pre-trained model for programming language (PL) and
natural language (NL). CodeBERT learns general-purpose representations that support …
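A short sketch of the kind of natural-language-to-code retrieval such bimodal representations support, assuming the transformers library and the public microsoft/codebert-base checkpoint; the [CLS] pooling and cosine-similarity ranking below are illustrative choices, not the paper's evaluation protocol.

```python
# Rank code snippets against a natural-language query with a bimodal encoder.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("microsoft/codebert-base")
model = AutoModel.from_pretrained("microsoft/codebert-base")

def embed(text):
    # Use the first ([CLS]) token's hidden state as a single-vector representation.
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state
    return hidden[0, 0]

query = "return the maximum value in a list"
snippets = [
    "def max_value(xs): return max(xs)",
    "def read_file(path): return open(path).read()",
]
scores = [torch.cosine_similarity(embed(query), embed(s), dim=0).item() for s in snippets]
for snippet, score in sorted(zip(snippets, scores), key=lambda x: -x[1]):
    print(f"{score:.3f}  {snippet}")
```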

A primer in BERTology: What we know about how BERT works

A Rogers, O Kovaleva, A Rumshisky - Transactions of the Association …, 2021 - direct.mit.edu
Transformer-based models have pushed the state of the art in many areas of NLP, but our
understanding of what is behind their success is still limited. This paper is the first survey of …

Measuring and improving consistency in pretrained language models

Y Elazar, N Kassner, S Ravfogel… - Transactions of the …, 2021 - direct.mit.edu
Consistency of a model—that is, the invariance of its behavior under meaning-preserving
alternations in its input—is a highly desirable property in natural language processing. In …
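A hedged sketch of what such a consistency check can look like in practice, assuming the transformers fill-mask pipeline and the bert-base-cased checkpoint; the paraphrased prompts are ad hoc examples, not items from the paper's benchmark.

```python
# Consistency check: does a masked LM give the same answer to
# meaning-preserving paraphrases of the same factual query?
from transformers import pipeline

unmasker = pipeline("fill-mask", model="bert-base-cased")

paraphrases = [
    "The capital of France is [MASK].",
    "France's capital city is [MASK].",
    "The capital city of France is called [MASK].",
]

# Take each prompt's top prediction; a consistent model agrees across paraphrases.
predictions = [unmasker(p)[0]["token_str"].strip() for p in paraphrases]
print(predictions)
print("consistent:", len(set(predictions)) == 1)
```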

Exploiting cloze questions for few shot text classification and natural language inference

T Schick, H Schütze - arXiv preprint arXiv:2001.07676, 2020 - arxiv.org
Some NLP tasks can be solved in a fully unsupervised fashion by providing a pretrained
language model with" task descriptions" in natural language (eg, Radford et al., 2019). While …