Explainable and interpretable multimodal large language models: A comprehensive survey

Y Dang, K Huang, J Huo, Y Yan, S Huang, D Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
The rapid development of Artificial Intelligence (AI) has revolutionized numerous fields, with
large language models (LLMs) and computer vision (CV) systems driving advancements in …

Understanding the (extra-)ordinary: Validating deep model decisions with prototypical concept-based explanations

M Dreyer, R Achtibat, W Samek… - Proceedings of the …, 2024 - openaccess.thecvf.com
Ensuring both transparency and safety is critical when deploying Deep Neural Networks
(DNNs) in high-risk applications such as medicine. The field of explainable AI (XAI) has …

Discover-then-name: Task-agnostic concept bottlenecks via automated concept discovery

S Rao, S Mahajan, M Böhle, B Schiele - European Conference on …, 2024 - Springer
Concept Bottleneck Models (CBMs) have recently been proposed to address the
'black-box' problem of deep neural networks, by first mapping images to a human …

On the foundations of shortcut learning

KL Hermann, H Mobahi, T Fel, MC Mozer - arXiv preprint arXiv …, 2023 - arxiv.org
Deep-learning models can extract a rich assortment of features from data. Which features a
model uses depends not only on predictivity, i.e., how reliably a feature indicates train-set labels …

Understanding Video Transformers via Universal Concept Discovery

M Kowal, A Dave, R Ambrus… - Proceedings of the …, 2024 - openaccess.thecvf.com
This paper studies the problem of concept-based interpretability of transformer
representations for videos. Concretely, we seek to explain the decision-making process of …

Concept-Based Explanations in Computer Vision: Where Are We and Where Could We Go?

JH Lee, G Mikriukov, G Schwalbe, S Wermter… - arXiv preprint arXiv …, 2024 - arxiv.org
Concept-based XAI (C-XAI) approaches to explaining neural vision models are a promising
field of research, since explanations that refer to concepts (i.e., semantically meaningful parts …

Interpreting CLIP with sparse linear concept embeddings (SpLiCE)

U Bhalla, A Oesterling, S Srinivas, FP Calmon… - arXiv preprint arXiv …, 2024 - arxiv.org
CLIP embeddings have demonstrated remarkable performance across a wide range of
computer vision tasks. However, these high-dimensional, dense vector representations are …

Interpretability is in the mind of the beholder: A causal framework for human-interpretable representation learning

E Marconato, A Passerini, S Teso - Entropy, 2023 - mdpi.com
Research on Explainable Artificial Intelligence has recently started exploring the idea of
producing explanations that, rather than being expressed in terms of low-level features, are …

Reactive Model Correction: Mitigating Harm to Task-Relevant Features via Conditional Bias Suppression

D Bareeva, M Dreyer, F Pahde… - Proceedings of the …, 2024 - openaccess.thecvf.com
Deep Neural Networks are prone to learning and relying on spurious correlations in
the training data, which, for high-risk applications, can have fatal consequences. Various …

Beyond Scalars: Concept-Based Alignment Analysis in Vision Transformers

J Vielhaben, D Bareeva, J Berend, W Samek… - arXiv preprint arXiv …, 2024 - arxiv.org
Vision transformers (ViTs) can be trained using various learning paradigms, from fully
supervised to self-supervised. Diverse training protocols often result in significantly different …