Shortcomings of top-down randomization-based sanity checks for evaluations of deep neural network explanations

A Binder, L Weber, S Lapuschkin… - Proceedings of the …, 2023 - openaccess.thecvf.com
While the evaluation of explanations is an important step towards trustworthy models, it
needs to be done carefully, and the employed metrics need to be well-understood …
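
For orientation, a minimal PyTorch sketch of the top-down (cascading) parameter randomisation procedure that these sanity checks build on; explain_fn(model, x, target) is a hypothetical callable returning a saliency tensor, and module registration order stands in for network depth. This illustrates the general test, not this paper's analysis.

    import copy

    import torch
    from scipy.stats import spearmanr

    def cascading_randomization(model, explain_fn, x, target):
        # Reference explanation from the trained model.
        original = explain_fn(model, x, target).flatten()
        randomized = copy.deepcopy(model)
        # Modules with their own parameters, randomised output-to-input;
        # registration order is used as a proxy for depth.
        layers = [(name, mod) for name, mod in randomized.named_modules()
                  if list(mod.parameters(recurse=False))]
        similarities = {}
        for name, mod in reversed(layers):
            for p in mod.parameters(recurse=False):
                torch.nn.init.normal_(p)  # destroy the learned weights
            expl = explain_fn(randomized, x, target).flatten()
            rho, _ = spearmanr(original.detach().numpy(), expl.detach().numpy())
            similarities[name] = rho  # low |rho| => explanation reacts to randomisation
        return similarities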

A Fresh Look at Sanity Checks for Saliency Maps

A Hedström, L Weber, S Lapuschkin… - World Conference on …, 2024 - Springer
The Model Parameter Randomisation Test (MPRT) is highly recognised in the
eXplainable Artificial Intelligence (XAI) community due to its fundamental evaluative …

Thermostat: A large collection of NLP model explanations and analysis tools

N Feldhus, R Schwarzenberg, S Möller - arXiv preprint arXiv:2108.13961, 2021 - arxiv.org
In the language domain, as in other domains, neural explainability plays an ever more
important role, with feature attribution methods at the forefront. Many such methods require …

Black-box language model explanation by context length probing

O Cífka, A Liutkus - arXiv preprint arXiv:2212.14815, 2022 - arxiv.org
The increasingly widespread adoption of large language models has highlighted the need
for improving their explainability. We present context length probing, a novel explanation …
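
A minimal sketch of the general idea, assuming a Hugging Face causal LM (gpt2 here is an arbitrary stand-in): score the target token under progressively longer left contexts and read each token's importance off the change in score when it enters the context. An illustration of the principle, not the authors' implementation.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

    def target_logprob(context_ids, target_id):
        # Log-probability assigned to target_id right after context_ids.
        with torch.no_grad():
            logits = model(torch.tensor([context_ids])).logits[0, -1]
        return torch.log_softmax(logits, dim=-1)[target_id].item()

    def context_length_probe(text, target):
        ids = tok.encode(text)
        tgt = tok.encode(target)[0]
        # Score the target under progressively longer left contexts.
        scores = [target_logprob(ids[start:], tgt) for start in range(len(ids))]
        # Importance of token i: change in score when token i enters the context.
        return list(zip(tok.convert_ids_to_tokens(ids[:-1]),
                        [scores[i + 1] - scores[i] for i in range(len(ids) - 1)]))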

Sanity checks revisited: An exploration to repair the model parameter randomisation test

A Hedström, L Weber, S Lapuschkin… - arXiv preprint arXiv …, 2024 - arxiv.org
The Model Parameter Randomisation Test (MPRT) is widely acknowledged in the
eXplainable Artificial Intelligence (XAI) community for its well-motivated evaluative principle …

Endoscopy-based IBD identification by a quantized deep learning pipeline

M Datres, E Paolazzi, M Chierici, M Pozzi, A Colangelo… - BioData Mining, 2023 - Springer
Background: Discrimination between patients affected by inflammatory bowel diseases and
healthy controls on the basis of endoscopic imaging is a challenging problem for machine …

ReAGent: Towards A Model-agnostic Feature Attribution Method for Generative Language Models

Z Zhao, B Shan - arXiv preprint arXiv:2402.00794, 2024 - arxiv.org
Feature attribution methods (FAs), such as gradients and attention, are widely employed
approaches to derive the importance of all input features to the model predictions. Existing …
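
As background on the gradient-based FAs the snippet names (this is not ReAGent itself), a minimal Gradient × Input sketch for a Hugging Face causal LM; gpt2 is an arbitrary stand-in.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

    def gradient_x_input(text):
        ids = tok(text, return_tensors="pt").input_ids
        # Work on a leaf copy of the input embeddings so we can take gradients.
        embeds = model.get_input_embeddings()(ids).detach().requires_grad_(True)
        logits = model(inputs_embeds=embeds).logits
        logits[0, -1].max().backward()  # score of the model's top next-token prediction
        # One attribution score per input token.
        return (embeds * embeds.grad).sum(-1).squeeze(0)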

Occlusion Sensitivity Analysis with Augmentation Subspace Perturbation in Deep Feature Space

PHV Valois, K Niinuma, K Fukui - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Deep learning with neural networks has gained prominence in multiple life-critical
applications, such as medical diagnosis and autonomous vehicle accident investigation …
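
For contrast with the paper's deep-feature-space approach, a minimal sketch of classic input-space occlusion sensitivity, assuming a PyTorch image classifier that returns logits; patch size, stride, and fill value are illustrative.

    import torch

    def occlusion_map(model, image, target_class, patch=16, stride=8, fill=0.0):
        # image: (C, H, W) tensor; model: classifier returning logits.
        model.eval()
        _, H, W = image.shape
        with torch.no_grad():
            base = model(image.unsqueeze(0)).softmax(-1)[0, target_class].item()
        rows = (H - patch) // stride + 1
        cols = (W - patch) // stride + 1
        heat = torch.zeros(rows, cols)
        for i, y in enumerate(range(0, H - patch + 1, stride)):
            for j, x in enumerate(range(0, W - patch + 1, stride)):
                occluded = image.clone()
                occluded[:, y:y + patch, x:x + patch] = fill  # mask one patch
                with torch.no_grad():
                    score = model(occluded.unsqueeze(0)).softmax(-1)[0, target_class].item()
                heat[i, j] = base - score  # large drop => important region
        return heat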

The Meta-Evaluation Problem in Explainable AI: Identifying Reliable Estimators with MetaQuantus

KK Wickstrøm, MMC Höhne - 2023 - munin.uit.no
Explainable AI (XAI) is a rapidly evolving field that aims to improve the transparency and
trustworthiness of AI systems for humans. One of the unsolved challenges in XAI is estimating …
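
A sketch of the perturbation-based meta-evaluation idea, under the assumption that estimator(model, x) is a hypothetical callable returning a quality score: a reliable estimator should be resilient to minor model perturbations and reactive to disruptive ones. The sigma values are illustrative, not the paper's settings.

    import copy

    import torch

    def meta_evaluate(estimator, model, inputs, sigma_minor=1e-3, sigma_major=1.0):
        base = torch.tensor([estimator(model, x) for x in inputs])

        def scores_under(sigma):
            # Copy the model and add Gaussian noise to every parameter.
            noisy = copy.deepcopy(model)
            with torch.no_grad():
                for p in noisy.parameters():
                    p.add_(sigma * torch.randn_like(p))
            return torch.tensor([estimator(noisy, x) for x in inputs])

        resilience = (base - scores_under(sigma_minor)).abs().mean().item()  # want small
        reactivity = (base - scores_under(sigma_major)).abs().mean().item()  # want large
        return resilience, reactivity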