A primer in BERTology: What we know about how BERT works

A Rogers, O Kovaleva, A Rumshisky - Transactions of the Association …, 2021 - direct.mit.edu
Transformer-based models have pushed the state of the art in many areas of NLP, but our
understanding of what is behind their success is still limited. This paper is the first survey of …

Transformer grammars: Augmenting transformer language models with syntactic inductive biases at scale

L Sartran, S Barrett, A Kuncoro, M Stanojević… - Transactions of the …, 2022 - direct.mit.edu
We introduce Transformer Grammars (TGs), a novel class of Transformer language
models that combine (i) the expressive power, scalability, and strong performance of …

Automatic creation of acceptance tests by extracting conditionals from requirements: NLP approach and case study

J Fischbach, J Frattini, A Vogelsang, D Mendez… - Journal of Systems and …, 2023 - Elsevier
Acceptance testing is crucial to determine whether a system fulfills end-user requirements.
However, the creation of acceptance tests is a laborious task entailing two major …

Dual attention graph convolutional network for relation extraction

D Zhang, Z Liu, W Jia, F Wu, H Liu… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Dependency-based models are widely used to extract semantic relations in text. Most
existing dependency-based models establish stacked structures to merge contextual and …

Encoding syntactic knowledge in transformer encoder for intent detection and slot filling

J Wang, K Wei, M Radfar, W Zhang… - Proceedings of the AAAI …, 2021 - ojs.aaai.org
We propose a novel Transformer encoder-based architecture with syntactical knowledge
encoded for intent detection and slot filling. Specifically, we encode syntactic knowledge into …

[PDF] Language of AI

D Bylieva - Technology and Language, 2022 - coeckelbergh.net
In the modern world of human-robot relations, language plays a significant role. One used to
view language as a purely human technology, but today language is being mastered by non …

A review on main optimization methods of BERT

L Huan, Z Zhixiong, W Yufei - Data Analysis and …, 2021 - manu44.magtech.com.cn
[Objective] This paper analyzes and summarizes the main optimization methods of the BERT
language representation model released by Google, to provide a reference for future studies …

BERTology for machine translation: What BERT knows about linguistic difficulties for translation

Y Dai, M de Kamps, S Sharoff - Proceedings of the thirteenth …, 2022 - aclanthology.org
Pre-trained transformer-based models, such as BERT, have shown excellent performance in
most natural language processing benchmark tests, but we still lack a good understanding …

[HTML] Causality in requirements artifacts: prevalence, detection, and impact

J Frattini, J Fischbach, D Mendez… - Requirements …, 2023 - Springer
Causal relations in natural language (NL) requirements convey strong semantic
information. Automatically extracting such causal information enables multiple use cases …

Syntactic structure distillation pretraining for bidirectional encoders

A Kuncoro, L Kong, D Fried, D Yogatama… - Transactions of the …, 2020 - direct.mit.edu
Textual representation learners trained on large amounts of data have achieved notable
success on downstream tasks; intriguingly, they have also performed well on challenging …