Interpretable deep learning: Interpretation, interpretability, trustworthiness, and beyond

X Li, H Xiong, X Li, X Wu, X Zhang, J Liu, J Bian… - … and Information Systems, 2022 - Springer
Deep neural networks are well known for their superb performance on a wide range of machine
learning and artificial intelligence tasks. However, due to their over-parameterized, black-box …

Analysis methods in neural language processing: A survey

Y Belinkov, J Glass - … of the Association for Computational Linguistics, 2019 - direct.mit.edu
The field of natural language processing has seen impressive progress in recent years, with
neural network models replacing many of the traditional systems. A plethora of new models …

A survey of the state of explainable AI for natural language processing

M Danilevsky, K Qian, R Aharonov, Y Katsis… - arXiv preprint arXiv …, 2020 - arxiv.org
Recent years have seen important advances in the quality of state-of-the-art models, but this
has come at the expense of models becoming less interpretable. This survey presents an …

Does the whole exceed its parts? The effect of AI explanations on complementary team performance

G Bansal, T Wu, J Zhou, R Fok, B Nushi… - Proceedings of the …, 2021 - dl.acm.org
Many researchers motivate explainable AI with studies showing that human-AI team
performance on decision-making tasks improves when the AI explains its recommendations …

Towards faithfully interpretable NLP systems: How should we define and evaluate faithfulness?

A Jacovi, Y Goldberg - arXiv preprint arXiv:2004.03685, 2020 - arxiv.org
With the growing popularity of deep-learning-based NLP models comes a need for
interpretable systems. But what is interpretability, and what constitutes a high-quality …

Attention is not explanation

S Jain, BC Wallace - arXiv preprint arXiv:1902.10186, 2019 - arxiv.org
Attention mechanisms have seen wide adoption in neural NLP models. In addition to
improving predictive performance, they are often touted as affording transparency: models …

Is attention interpretable?

S Serrano, NA Smith - arXiv preprint arXiv:1906.03731, 2019 - arxiv.org
Attention mechanisms have recently boosted performance on a range of NLP tasks.
Because attention layers explicitly weight input components' representations, it is also often …
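
The property at issue in these entries is that attention produces an explicit weight distribution over input representations, and that distribution is what gets read as an "explanation." A minimal sketch of single-head, unmasked scaled dot-product attention, with generic names and shapes chosen for illustration rather than taken from any of the cited papers:

    # Minimal sketch of scaled dot-product attention. Illustrative only:
    # names and shapes are generic, not from any cited paper.
    import numpy as np

    def scaled_dot_product_attention(Q, K, V):
        """Return (output, weights). weights[i, j] is how much input j
        contributes to output i -- the quantity debated as an 'explanation'."""
        d_k = Q.shape[-1]
        scores = Q @ K.T / np.sqrt(d_k)               # (n_query, n_key) similarities
        scores -= scores.max(axis=-1, keepdims=True)  # numerically stable softmax
        weights = np.exp(scores)
        weights /= weights.sum(axis=-1, keepdims=True)  # each row sums to 1
        return weights @ V, weights

    # Toy usage: self-attention over 4 tokens with 8-dimensional representations.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(4, 8))
    out, attn = scaled_dot_product_attention(X, X, X)
    print(attn.round(2))  # per-token weight distribution over the inputs

Whether inspecting `attn` faithfully explains a model's prediction is exactly what the surrounding papers test and dispute.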

An empirical study of spatial attention mechanisms in deep networks

X Zhu, D Cheng, Z Zhang, S Lin… - Proceedings of the IEEE …, 2019 - openaccess.thecvf.com
Attention mechanisms have become a popular component in deep neural networks, yet
there has been little examination of how different influencing factors and methods for …

Self-attention attribution: Interpreting information interactions inside Transformer

Y Hao, L Dong, F Wei, K Xu - Proceedings of the AAAI Conference on …, 2021 - ojs.aaai.org
The great success of Transformer-based models benefits from the powerful multi-head self-
attention mechanism, which learns token dependencies and encodes contextual information …

Attention interpretability across NLP tasks

S Vashishth, S Upadhyay, GS Tomar… - arXiv preprint arXiv …, 2019 - arxiv.org
The attention layer in a neural network model provides insight into the model's reasoning
behind its prediction, reasoning that is otherwise criticized for being opaque. Recently, seemingly …