Interpretable and explainable machine learning: a methods-centric overview with concrete examples

R Marcinkevičs, JE Vogt - Wiley Interdisciplinary Reviews: Data …, 2023 - Wiley Online Library
Interpretability and explainability are crucial for machine learning (ML) and statistical
applications in medicine, economics, law, and natural sciences and form an essential …

From "where" to "what": Towards human-understandable explanations through concept relevance propagation

R Achtibat, M Dreyer, I Eisenbraun, S Bosse… - arXiv preprint arXiv …, 2022 - arxiv.org
The emerging field of eXplainable Artificial Intelligence (XAI) aims to bring transparency to
today's powerful but opaque deep learning models. While local XAI methods explain …

Revealing hidden context bias in segmentation and object detection through concept-specific explanations

M Dreyer, R Achtibat, T Wiegand… - Proceedings of the …, 2023 - openaccess.thecvf.com
Applying traditional post-hoc attribution methods to segmentation or object detection
predictors offers only limited insights, as the obtained feature attribution maps at input level …

Beyond model interpretability: socio-structural explanations in machine learning

A Smart, A Kasirzadeh - AI & SOCIETY, 2024 - Springer
What is it to interpret the outputs of an opaque machine learning model? One approach is to
develop interpretable machine learning techniques. These techniques aim to show how …

Human-centered concept explanations for neural networks

CK Yeh, B Kim, P Ravikumar - … intelligence: The state of the art, 2021 - ebooks.iospress.nl
Understanding complex machine learning models such as deep neural networks with
explanations is crucial in various applications. Many explanations stem from the model …

How machines could teach physicists new scientific concepts

I Georgescu - Nature Reviews Physics, 2022 - nature.com

Concept distillation: leveraging human-centered explanations for model improvement

A Gupta, S Saini, PJ Narayanan - Advances in Neural …, 2024 - proceedings.neurips.cc
Humans use abstract concepts for understanding instead of hard features. Recent
interpretability research has focused on human-centered concept explanations of neural …

Concept gradient: Concept-based interpretation without linear assumption

A Bai, CK Yeh, P Ravikumar, NYC Lin… - arXiv preprint arXiv …, 2022 - arxiv.org
Concept-based interpretations of black-box models are often more intuitive for humans to
understand. The most widely adopted approach for concept-based interpretation is Concept …

You are my type! Type embeddings for pre-trained language models

M Saeed, P Papotti - Findings of the Association for …, 2022 - aclanthology.org
One reason for the positive impact of Pre-trained Language Models (PLMs) in NLP tasks is
their ability to encode semantic types, such as 'European City' or 'Woman'. While previous …

CAVLI: Using image associations to produce local concept-based explanations

P Shukla, S Bharati, M Turk - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
While explainability is becoming increasingly crucial in computer vision and machine
learning, producing explanations that are able to link decisions made by deep neural …