Dissociating language and thought in large language models

K Mahowald, AA Ivanova, IA Blank, N Kanwisher… - Trends in Cognitive …, 2024 - cell.com
Large language models (LLMs) have come closest among all models to date to mastering
human language, yet opinions about their linguistic and cognitive capabilities remain split …

Recent advances in natural language processing via large pre-trained language models: A survey

B Min, H Ross, E Sulem, APB Veyseh… - ACM Computing …, 2023 - dl.acm.org
Large, pre-trained language models (PLMs) such as BERT and GPT have drastically
changed the Natural Language Processing (NLP) field. For numerous NLP tasks …

Pre-trained language models and their applications

H Wang, J Li, H Wu, E Hovy, Y Sun - Engineering, 2022 - Elsevier
Pre-trained language models have achieved striking success in natural language
processing (NLP), leading to a paradigm shift from supervised learning to pre-training …

Learning how to ask: Querying LMs with mixtures of soft prompts

G Qin, J Eisner - arXiv preprint arXiv:2104.06599, 2021 - arxiv.org
Natural-language prompts have recently been used to coax pretrained language models
into performing other AI tasks, using a fill-in-the-blank paradigm (Petroni et al., 2019) or a …
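A minimal sketch of the soft-prompt idea in PyTorch, under stated assumptions: a tiny frozen network stands in for a real pretrained LM, and a single trainable prompt is used rather than the paper's mixture of prompts. Only the continuous prompt vectors receive gradient updates; everything else is kept frozen.

```python
# Soft-prompt tuning sketch (toy, self-contained; no pretrained weights).
import torch
import torch.nn as nn

VOCAB, DIM, PROMPT_LEN = 1000, 64, 5

# Frozen stand-in for a pretrained LM: embeddings + one transformer layer + LM head.
embed = nn.Embedding(VOCAB, DIM)
encoder = nn.TransformerEncoderLayer(d_model=DIM, nhead=4, batch_first=True)
lm_head = nn.Linear(DIM, VOCAB)
for module in (embed, encoder, lm_head):
    for p in module.parameters():
        p.requires_grad = False  # the "pretrained" model stays frozen

# The only trainable parameters: a short sequence of continuous prompt vectors.
soft_prompt = nn.Parameter(torch.randn(PROMPT_LEN, DIM) * 0.02)
optimizer = torch.optim.Adam([soft_prompt], lr=1e-3)

def forward(input_ids):
    tok = embed(input_ids)                                   # (B, T, DIM)
    prompt = soft_prompt.unsqueeze(0).expand(tok.size(0), -1, -1)
    hidden = encoder(torch.cat([prompt, tok], dim=1))        # prepend the soft prompt
    return lm_head(hidden[:, -1])                            # score the "answer" token

# One toy training step: tune the prompt so inputs map to target token ids.
input_ids = torch.randint(0, VOCAB, (8, 10))
targets = torch.randint(0, VOCAB, (8,))
loss = nn.functional.cross_entropy(forward(input_ids), targets)
loss.backward()
optimizer.step()
print("loss:", loss.item())
```

In practice the same pattern is applied on top of a pretrained checkpoint; the sketch only shows where the learnable prompt enters the computation.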

It's not just size that matters: Small language models are also few-shot learners

T Schick, H Schütze - arXiv preprint arXiv:2009.07118, 2020 - arxiv.org
When scaled to hundreds of billions of parameters, pretrained language models such as
GPT-3 (Brown et al., 2020) achieve remarkable few-shot performance. However, enormous …

Leveraging passage retrieval with generative models for open domain question answering

G Izacard, E Grave - arXiv preprint arXiv:2007.01282, 2020 - arxiv.org
Generative models for open domain question answering have proven to be competitive,
without resorting to external knowledge. While promising, this approach requires the use of …
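A simplified sketch of the retrieve-then-generate pattern the abstract describes, assuming the Hugging Face transformers library and the public t5-small checkpoint (assumptions for illustration, not the paper's setup). The retriever here is a toy word-overlap scorer, and the retrieved text is simply concatenated with the question, whereas the paper encodes each (question, passage) pair separately and fuses them in the decoder.

```python
# Retrieval-augmented generation sketch: toy retriever + seq2seq generator.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

passages = [
    "The Eiffel Tower is located in Paris and was completed in 1889.",
    "Mount Everest is the highest mountain above sea level.",
]

def retrieve(question, k=1):
    # Toy lexical retrieval: rank passages by word overlap with the question.
    q = set(question.lower().split())
    return sorted(passages, key=lambda p: -len(q & set(p.lower().split())))[:k]

question = "When was the Eiffel Tower completed?"
context = " ".join(retrieve(question))

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

# Condition generation on the retrieved context rather than on model memory alone.
inputs = tokenizer(f"question: {question} context: {context}", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```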

Codebert: A pre-trained model for programming and natural languages

Z Feng, D Guo, D Tang, N Duan, X Feng… - arXiv preprint arXiv …, 2020 - arxiv.org
We present CodeBERT, a bimodal pre-trained model for programming language (PL) and
natural language (NL). CodeBERT learns general-purpose representations that support …
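A short sketch of the kind of natural-language-to-code retrieval such bimodal representations support, assuming the transformers library and the public microsoft/codebert-base checkpoint; the [CLS] pooling and cosine-similarity ranking below are illustrative choices, not the paper's evaluation protocol.

```python
# Rank code snippets against a natural-language query with a bimodal encoder.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("microsoft/codebert-base")
model = AutoModel.from_pretrained("microsoft/codebert-base")

def embed(text):
    # Use the first ([CLS]) token's hidden state as a single-vector representation.
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state
    return hidden[0, 0]

query = "return the maximum value in a list"
snippets = [
    "def max_value(xs): return max(xs)",
    "def read_file(path): return open(path).read()",
]
scores = [torch.cosine_similarity(embed(query), embed(s), dim=0).item() for s in snippets]
for snippet, score in sorted(zip(snippets, scores), key=lambda x: -x[1]):
    print(f"{score:.3f}  {snippet}")
```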

A primer in BERTology: What we know about how BERT works

A Rogers, O Kovaleva, A Rumshisky - Transactions of the Association …, 2021 - direct.mit.edu
Transformer-based models have pushed the state of the art in many areas of NLP, but our
understanding of what is behind their success is still limited. This paper is the first survey of …

Measuring and improving consistency in pretrained language models

Y Elazar, N Kassner, S Ravfogel… - Transactions of the …, 2021 - direct.mit.edu
Consistency of a model—that is, the invariance of its behavior under meaning-preserving
alternations in its input—is a highly desirable property in natural language processing. In …
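A hedged sketch of what such a consistency check can look like in practice, assuming the transformers fill-mask pipeline and the bert-base-cased checkpoint; the paraphrased prompts are ad hoc examples, not items from the paper's benchmark.

```python
# Consistency check: does a masked LM give the same answer to
# meaning-preserving paraphrases of the same factual query?
from transformers import pipeline

unmasker = pipeline("fill-mask", model="bert-base-cased")

paraphrases = [
    "The capital of France is [MASK].",
    "France's capital city is [MASK].",
    "The capital city of France is called [MASK].",
]

# Take each prompt's top prediction; a consistent model agrees across paraphrases.
predictions = [unmasker(p)[0]["token_str"].strip() for p in paraphrases]
print(predictions)
print("consistent:", len(set(predictions)) == 1)
```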

Exploiting cloze questions for few shot text classification and natural language inference

T Schick, H Schütze - arXiv preprint arXiv:2001.07676, 2020 - arxiv.org
Some NLP tasks can be solved in a fully unsupervised fashion by providing a pretrained
language model with" task descriptions" in natural language (eg, Radford et al., 2019). While …