ReFT: Representation finetuning for language models

Z Wu, A Arora, Z Wang, A Geiger, D Jurafsky… - arXiv preprint arXiv …, 2024 - arxiv.org
Parameter-efficient fine-tuning (PEFT) methods seek to adapt large models via updates to a
small number of weights. However, much prior interpretability work has shown that …

Philosophy of cognitive science in the age of deep learning

R Millière - Wiley Interdisciplinary Reviews: Cognitive Science, 2024 - Wiley Online Library
Deep learning has enabled major advances across most areas of artificial intelligence
research. This remarkable progress extends beyond mere engineering achievements and …

Language models as models of language

R Millière - arXiv preprint arXiv:2408.07144, 2024 - arxiv.org
This chapter critically examines the potential contributions of modern language models to
theoretical linguistics. Despite their focus on engineering goals, these models' ability to …

CausalGym: Benchmarking causal interpretability methods on linguistic tasks

A Arora, D Jurafsky, C Potts - arXiv preprint arXiv:2402.12560, 2024 - arxiv.org
Language models (LMs) have proven to be powerful tools for psycholinguistic research, but
most prior work has focused on purely behavioural measures (e.g., surprisal comparisons). At …

The limitations of large language models for understanding human language and cognition

C Cuskley, R Woods, M Flaherty - Open Mind, 2024 - direct.mit.edu
Researchers have recently argued that the capabilities of Large Language Models (LLMs)
can provide new insights into longstanding debates about the role of learning and/or …