Verb conjugation in transformers is determined by linear encodings of subject number
Deep architectures such as Transformers are sometimes criticized for having uninterpretable "black-box" representations. We use causal intervention analysis to show that, in fact, some linguistic features are represented in a linear, interpretable format. Specifically, we show that BERT's ability to conjugate verbs relies on a linear encoding of subject number that can be manipulated with predictable effects on conjugation accuracy. This encoding is found in the subject position at the first layer and the verb position at the last layer, but distributed across positions at middle layers, particularly when there are multiple cues to subject number.
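The causal intervention the abstract describes can be illustrated with a short sketch. The following is a minimal, hypothetical illustration, not the authors' exact procedure: it assumes a `number_direction` vector, which in the paper's setting would be learned (e.g., as a linear probe's weight vector over hidden states labeled singular vs. plural) but is a random placeholder here, and it assumes an illustrative intervention layer and scale `alpha`. The sketch shifts BERT's hidden state at the masked-verb position along that direction at a middle layer, then inspects the logits for singular vs. plural verb forms.

```python
import torch
from transformers import BertTokenizer, BertForMaskedLM

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

# Classic agreement-across-attractor sentence: plural subject "keys",
# singular attractor "cabinet", verb masked out.
sentence = "The keys to the cabinet [MASK] on the table."
inputs = tokenizer(sentence, return_tensors="pt")
mask_idx = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero().item()

# Placeholder for a learned "subject number" direction. In an actual
# replication this would come from a linear probe trained on hidden states
# labeled for subject number; a random unit vector is used here purely
# so the sketch runs end to end.
hidden_size = model.config.hidden_size
number_direction = torch.randn(hidden_size)
number_direction /= number_direction.norm()

def shift_verb_state(module, module_inputs, module_output, alpha=5.0):
    # Add alpha * direction to the hidden state at the verb (mask) position.
    hidden = module_output[0]
    hidden[:, mask_idx, :] += alpha * number_direction
    return (hidden,) + module_output[1:]

# Intervene at a middle layer (layer 6 of 12), where the abstract reports
# the number encoding is distributed across positions.
handle = model.bert.encoder.layer[6].register_forward_hook(shift_verb_state)

with torch.no_grad():
    logits = model(**inputs).logits[0, mask_idx]
handle.remove()

# Compare logits for the singular and plural verb forms.
for verb in ["is", "are"]:
    tok_id = tokenizer.convert_tokens_to_ids(verb)
    print(verb, logits[tok_id].item())
```

With a genuinely learned direction, pushing the state toward the "singular" side should raise the logit for "is" relative to "are" (and vice versa), which is the predictable effect on conjugation accuracy the abstract claims; with the random placeholder above, the shift has no systematic effect.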