State-of-the-art generalisation research in NLP: a taxonomy and review

D Hupkes, M Giulianelli, V Dankers, M Artetxe… - arXiv preprint arXiv …, 2022 - arxiv.org
The ability to generalise well is one of the primary desiderata of natural language
processing (NLP). Yet, what 'good generalisation' entails and how it should be evaluated is …

Findings of the BabyLM Challenge: Sample-efficient pretraining on developmentally plausible corpora

A Warstadt, A Mueller, L Choshen… - … of the BabyLM …, 2023 - research-collection.ethz.ch
Children can acquire language from less than 100 million words of input. Large language
models are far less data-efficient: they typically require 3 or 4 orders of magnitude more data …

A theory of emergent in-context learning as implicit structure induction

M Hahn, N Goyal - arXiv preprint arXiv:2303.07971, 2023 - arxiv.org
Scaling large language models (LLMs) leads to an emergent capacity to learn in-context
from example demonstrations. Despite progress, theoretical understanding of this …

Unit testing for concepts in neural networks

C Lovering, E Pavlick - Transactions of the Association for …, 2022 - direct.mit.edu
Many complex problems are naturally understood in terms of symbolic concepts. For
example, our concept of “cat” is related to our concepts of “ears” and “whiskers” in a non …

How abstract is linguistic generalization in large language models? Experiments with argument structure

M Wilson, J Petty, R Frank - Transactions of the Association for …, 2023 - direct.mit.edu
Language models are typically evaluated on their success at predicting the
distribution of specific words in specific contexts. Yet linguistic knowledge also encodes …

Grokking of hierarchical structure in vanilla transformers

S Murty, P Sharma, J Andreas, CD Manning - arXiv preprint arXiv …, 2023 - arxiv.org
For humans, language production and comprehension are sensitive to the hierarchical
structure of sentences. In natural language processing, past work has questioned how …

How poor is the stimulus? Evaluating hierarchical generalization in neural networks trained on child-directed speech

A Yedetore, T Linzen, R Frank, RT McCoy - arXiv preprint arXiv …, 2023 - arxiv.org
When acquiring syntax, children consistently choose hierarchical rules over competing
non-hierarchical possibilities. Is this preference due to a learning bias for hierarchical structure …

Language model acceptability judgements are not always robust to context

K Sinha, J Gauthier, A Mueller, K Misra… - arXiv preprint arXiv …, 2022 - arxiv.org
Targeted syntactic evaluations of language models ask whether models show stable
preferences for syntactically acceptable content over minimal-pair unacceptable inputs. Most …

The Impact of Depth on Compositional Generalization in Transformer Language Models

J Petty, S Steenkiste, I Dasgupta, F Sha… - Proceedings of the …, 2024 - aclanthology.org
To process novel sentences, language models (LMs) must generalize compositionally—
combine familiar elements in new ways. What aspects of a model's structure promote …

How to plant trees in language models: Data and architectural effects on the emergence of syntactic inductive biases

A Mueller, T Linzen - arXiv preprint arXiv:2305.19905, 2023 - arxiv.org
Accurate syntactic representations are essential for robust generalization in natural
language. Recent work has found that pre-training can teach language models to rely on …