Post-hoc interpretability for neural NLP: A survey

A Madsen, S Reddy, S Chandar - ACM Computing Surveys, 2022 - dl.acm.org
Neural networks for NLP are becoming increasingly complex and widespread, and there is
growing concern about whether these models are responsible to use. Explaining models helps to address …

Explainability for large language models: A survey

H Zhao, H Chen, F Yang, N Liu, H Deng, H Cai… - ACM Transactions on …, 2024 - dl.acm.org
Large language models (LLMs) have demonstrated impressive capabilities in natural
language processing. However, their internal mechanisms are still unclear and this lack of …

Protein design with guided discrete diffusion

N Gruver, S Stanton, N Frey… - Advances in neural …, 2024 - proceedings.neurips.cc
A popular approach to protein design is to combine a generative model with a discriminative
model for conditional sampling. The generative model samples plausible sequences while …
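
The pattern sketched in this abstract, a generative model proposing candidate sequences while a discriminative model steers the samples toward a desired property, can be illustrated with a toy example. The sketch below is not the paper's guided discrete diffusion algorithm; the amino-acid alphabet, the uniform proposal model, and the hydrophobicity-based scorer are illustrative stand-ins for a learned generator and discriminator.

```python
# Illustrative sketch only: generative proposals reweighted by a discriminative score
# to approximate conditional sampling; not the paper's guided diffusion method.
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def generative_sample(length: int = 10) -> str:
    """Stand-in generative model: samples a plausible sequence uniformly at random."""
    return "".join(random.choice(AMINO_ACIDS) for _ in range(length))

def discriminator_score(seq: str) -> float:
    """Stand-in discriminative model: toy objective rewarding hydrophobic residues."""
    hydrophobic = set("AILMFWV")
    return sum(aa in hydrophobic for aa in seq) / len(seq)

def conditional_sample(n_proposals: int = 512, temperature: float = 0.1) -> str:
    """Draw one sequence from proposals reweighted by the discriminator (softmax over
    scores), approximating sampling conditioned on the target property."""
    proposals = [generative_sample() for _ in range(n_proposals)]
    weights = [math.exp(discriminator_score(s) / temperature) for s in proposals]
    total = sum(weights)
    return random.choices(proposals, weights=[w / total for w in weights], k=1)[0]

if __name__ == "__main__":
    random.seed(0)
    print(conditional_sample())
```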

Toward transparent AI: A survey on interpreting the inner structures of deep neural networks

T Räuker, A Ho, S Casper… - 2023 IEEE Conference …, 2023 - ieeexplore.ieee.org
The last decade of machine learning has seen drastic increases in scale and capabilities.
Deep neural networks (DNNs) are increasingly being deployed in the real world. However …

Towards faithful model explanation in NLP: A survey

Q Lyu, M Apidianaki, C Callison-Burch - Computational Linguistics, 2024 - direct.mit.edu
End-to-end neural Natural Language Processing (NLP) models are notoriously difficult to
understand. This has given rise to numerous efforts towards model explainability in recent …

Rethinking interpretability in the era of large language models

C Singh, JP Inala, M Galley, R Caruana… - arXiv preprint arXiv …, 2024 - arxiv.org
Interpretable machine learning has exploded as an area of interest over the last decade,
sparked by the rise of increasingly large datasets and deep neural networks …

Inseq: An interpretability toolkit for sequence generation models

G Sarti, N Feldhus, L Sickert, O Van Der Wal… - arXiv preprint arXiv …, 2023 - arxiv.org
Past work in natural language processing interpretability focused mainly on popular
classification tasks while largely overlooking generation settings, partly due to a lack of …
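
For readers unfamiliar with the toolkit, a minimal usage sketch follows. It assumes the inseq package is installed and that a Hugging Face checkpoint (here "gpt2") can be downloaded; the choice of model and of integrated gradients as the attribution method are illustrative.

```python
# Minimal sketch of attributing a generation with Inseq (assumes `pip install inseq`;
# the model name and attribution method below are illustrative choices).
import inseq

# Wrap a Hugging Face model together with an attribution method.
model = inseq.load_model("gpt2", "integrated_gradients")

# Attribute the tokens generated after the prompt to the input tokens.
out = model.attribute("The capital of France is")

# Render the attribution scores (HTML in notebooks, text in a terminal).
out.show()
```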

Knowledge mining: A cross-disciplinary survey

Y Rui, VIS Carmona, M Pourvali, Y Xing, WW Yi… - Machine Intelligence …, 2022 - Springer
Knowledge mining is a widely active research area across disciplines such as
natural language processing (NLP), data mining (DM), and machine learning (ML). The …

Explainable information retrieval: A survey

A Anand, L Lyu, M Idahl, Y Wang, J Wallat… - arXiv preprint arXiv …, 2022 - arxiv.org
Explainable information retrieval is an emerging research area that aims to make information
retrieval systems transparent and trustworthy. Given the increasing use of complex machine …

Explaining how transformers use context to build predictions

J Ferrando, GI Gállego, I Tsiamas… - arXiv preprint arXiv …, 2023 - arxiv.org
Language Generation Models produce words based on the previous context. Although
existing methods offer input attributions as explanations for a model's prediction, it is still …
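
As a concrete example of the input-attribution setting this work starts from (not the method proposed in the paper), the sketch below attributes one next-token prediction of a Hugging Face causal language model to its context tokens using plain gradient-times-input; the "gpt2" checkpoint and the prompt are arbitrary choices.

```python
# Gradient-times-input attribution of a single next-token prediction to the context
# tokens; an illustrative baseline, not the attribution method proposed in the paper.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

inputs = tokenizer("The capital of France is", return_tensors="pt")

# Embed the context so gradients can be taken with respect to the input embeddings.
embeds = model.get_input_embeddings()(inputs["input_ids"]).detach().requires_grad_(True)
logits = model(inputs_embeds=embeds, attention_mask=inputs["attention_mask"]).logits

# Backpropagate the score of the most likely next token at the final position.
next_id = logits[0, -1].argmax()
logits[0, -1, next_id].backward()

# Gradient x input, summed over the embedding dimension, gives one score per context token.
scores = (embeds.grad * embeds).sum(-1).squeeze(0)
for tok, score in zip(tokenizer.convert_ids_to_tokens(inputs["input_ids"][0]), scores):
    print(f"{tok:>12} {score.item():+.4f}")
```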