Probing classifiers: Promises, shortcomings, and advances
Y Belinkov - Computational Linguistics, 2022 - direct.mit.edu
Probing classifiers have emerged as one of the prominent methodologies for interpreting
and analyzing deep neural network models of natural language processing. The basic idea …
Analysis methods in neural language processing: A survey
Y Belinkov, J Glass - … of the Association for Computational Linguistics, 2019 - direct.mit.edu
The field of natural language processing has seen impressive progress in recent years, with
neural network models replacing many of the traditional systems. A plethora of new models …
Explainability for large language models: A survey
Large language models (LLMs) have demonstrated impressive capabilities in natural
language processing. However, their internal mechanisms are still unclear and this lack of …
Analyzing multi-head self-attention: Specialized heads do the heavy lifting, the rest can be pruned
Multi-head self-attention is a key component of the Transformer, a state-of-the-art
architecture for neural machine translation. In this work we evaluate the contribution made …
Designing and interpreting probes with control tasks
Probes, supervised models trained to predict properties (like parts-of-speech) from
representations (like ELMo), have achieved high accuracy on a range of linguistic tasks. But …
What you can cram into a single vector: Probing sentence embeddings for linguistic properties
Although much effort has recently been devoted to training high-quality sentence
embeddings, we still have a poor understanding of what they are capturing. "Downstream" …
Compositionality decomposed: How do neural networks generalise?
Despite a multitude of empirical studies, little consensus exists on whether neural networks
are able to generalise compositionally, a controversy that, in part, stems from a lack of …
A survey on semantic processing techniques
Semantic processing is a fundamental research domain in computational linguistics. In the
era of powerful pre-trained language models and large language models, the advancement …
The geometry of hidden representations of large transformer models
Large transformers are powerful architectures used for self-supervised data analysis across
various data types, including protein sequences, images, and text. In these models, the …
The bottom-up evolution of representations in the transformer: A study with machine translation and language modeling objectives
We seek to understand how the representations of individual tokens and the structure of the
learned feature space evolve between layers in deep neural networks under different …