Post-hoc interpretability for neural NLP: A survey

A Madsen, S Reddy, S Chandar - ACM Computing Surveys, 2022 - dl.acm.org
Neural networks for NLP are becoming increasingly complex and widespread, and there is
growing concern about whether these models are responsible to use. Explaining models helps to address …

Explainability for large language models: A survey

H Zhao, H Chen, F Yang, N Liu, H Deng, H Cai… - ACM Transactions on …, 2024 - dl.acm.org
Large language models (LLMs) have demonstrated impressive capabilities in natural
language processing. However, their internal mechanisms are still unclear and this lack of …

Protein design with guided discrete diffusion

N Gruver, S Stanton, N Frey… - Advances in neural …, 2024 - proceedings.neurips.cc
A popular approach to protein design is to combine a generative model with a discriminative
model for conditional sampling. The generative model samples plausible sequences while …
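
The pattern sketched in this abstract, a generative model proposing candidate sequences while a discriminative model steers the samples toward a desired property, can be illustrated with a toy example. The sketch below is not the paper's guided discrete diffusion algorithm; the amino-acid alphabet, the uniform proposal model, and the hydrophobicity-based scorer are illustrative stand-ins for a learned generator and discriminator.

```python
# Illustrative sketch only: generative proposals reweighted by a discriminative score
# to approximate conditional sampling; not the paper's guided diffusion method.
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def generative_sample(length: int = 10) -> str:
    """Stand-in generative model: samples a plausible sequence uniformly at random."""
    return "".join(random.choice(AMINO_ACIDS) for _ in range(length))

def discriminator_score(seq: str) -> float:
    """Stand-in discriminative model: toy objective rewarding hydrophobic residues."""
    hydrophobic = set("AILMFWV")
    return sum(aa in hydrophobic for aa in seq) / len(seq)

def conditional_sample(n_proposals: int = 512, temperature: float = 0.1) -> str:
    """Draw one sequence from proposals reweighted by the discriminator (softmax over
    scores), approximating sampling conditioned on the target property."""
    proposals = [generative_sample() for _ in range(n_proposals)]
    weights = [math.exp(discriminator_score(s) / temperature) for s in proposals]
    total = sum(weights)
    return random.choices(proposals, weights=[w / total for w in weights], k=1)[0]

if __name__ == "__main__":
    random.seed(0)
    print(conditional_sample())
```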

Toward transparent AI: A survey on interpreting the inner structures of deep neural networks

T Räuker, A Ho, S Casper… - 2023 IEEE Conference …, 2023 - ieeexplore.ieee.org
The last decade of machine learning has seen drastic increases in scale and capabilities.
Deep neural networks (DNNs) are increasingly being deployed in the real world. However …

Towards faithful model explanation in NLP: A survey

Q Lyu, M Apidianaki, C Callison-Burch - Computational Linguistics, 2024 - direct.mit.edu
End-to-end neural Natural Language Processing (NLP) models are notoriously difficult to
understand. This has given rise to numerous efforts towards model explainability in recent …

Rethinking interpretability in the era of large language models

C Singh, JP Inala, M Galley, R Caruana… - arXiv preprint arXiv …, 2024 - arxiv.org
Interpretable machine learning has exploded as an area of interest over the last decade,
sparked by the rise of increasingly large datasets and deep neural networks …

Inseq: An interpretability toolkit for sequence generation models

G Sarti, N Feldhus, L Sickert, O Van Der Wal… - arXiv preprint arXiv …, 2023 - arxiv.org
Past work in natural language processing interpretability focused mainly on popular
classification tasks while largely overlooking generation settings, partly due to a lack of …
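
For readers unfamiliar with the toolkit, a minimal usage sketch follows. It assumes the inseq package is installed and that a Hugging Face checkpoint (here "gpt2") can be downloaded; the choice of model and of integrated gradients as the attribution method are illustrative.

```python
# Minimal sketch of attributing a generation with Inseq (assumes `pip install inseq`;
# the model name and attribution method below are illustrative choices).
import inseq

# Wrap a Hugging Face model together with an attribution method.
model = inseq.load_model("gpt2", "integrated_gradients")

# Attribute the tokens generated after the prompt to the input tokens.
out = model.attribute("The capital of France is")

# Render the attribution scores (HTML in notebooks, text in a terminal).
out.show()
```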

Knowledge mining: A cross-disciplinary survey

Y Rui, VIS Carmona, M Pourvali, Y Xing, WW Yi… - Machine Intelligence …, 2022 - Springer
Knowledge mining is a widely active research area across disciplines such as
natural language processing (NLP), data mining (DM), and machine learning (ML). The …

Explainable information retrieval: A survey

A Anand, L Lyu, M Idahl, Y Wang, J Wallat… - arXiv preprint arXiv …, 2022 - arxiv.org
Explainable information retrieval is an emerging research area that aims to make information
retrieval systems transparent and trustworthy. Given the increasing use of complex machine …

Explaining how transformers use context to build predictions

J Ferrando, GI Gállego, I Tsiamas… - arXiv preprint arXiv …, 2023 - arxiv.org
Language Generation Models produce words based on the previous context. Although
existing methods offer input attributions as explanations for a model's prediction, it is still …
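
As a concrete example of the input-attribution setting this work starts from (not the method proposed in the paper), the sketch below attributes one next-token prediction of a Hugging Face causal language model to its context tokens using plain gradient-times-input; the "gpt2" checkpoint and the prompt are arbitrary choices.

```python
# Gradient-times-input attribution of a single next-token prediction to the context
# tokens; an illustrative baseline, not the attribution method proposed in the paper.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

inputs = tokenizer("The capital of France is", return_tensors="pt")

# Embed the context so gradients can be taken with respect to the input embeddings.
embeds = model.get_input_embeddings()(inputs["input_ids"]).detach().requires_grad_(True)
logits = model(inputs_embeds=embeds, attention_mask=inputs["attention_mask"]).logits

# Backpropagate the score of the most likely next token at the final position.
next_id = logits[0, -1].argmax()
logits[0, -1, next_id].backward()

# Gradient x input, summed over the embedding dimension, gives one score per context token.
scores = (embeds.grad * embeds).sum(-1).squeeze(0)
for tok, score in zip(tokenizer.convert_ids_to_tokens(inputs["input_ids"][0]), scores):
    print(f"{tok:>12} {score.item():+.4f}")
```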