Label-free concept bottleneck models
Concept bottleneck models (CBMs) are a popular way of creating more interpretable neural
networks by having hidden layer neurons correspond to human-understandable concepts …
Text-to-concept (and back) via cross-model alignment
We observe that the mapping between an image's representation in one model to its
representation in another can be learned surprisingly well with just a linear layer, even …
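The snippet above claims that the map from one model's image features to another's can be learned with just a linear layer. A minimal sketch of fitting such an alignment layer by least squares — the synthetic features and the ground-truth linear relation below are assumptions for illustration only; in practice the two feature matrices would come from two pretrained image encoders run on the same images:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for two models' representations of the same n images:
# feats_a (n, d_a) from model A, feats_b (n, d_b) from model B.
# feats_b is synthesized as a noisy linear function of feats_a purely
# to keep the sketch self-contained.
n, d_a, d_b = 1000, 64, 32
feats_a = rng.normal(size=(n, d_a))
true_map = rng.normal(size=(d_a, d_b))
feats_b = feats_a @ true_map + 0.01 * rng.normal(size=(n, d_b))

# Fit the linear alignment layer W so that feats_a @ W ~ feats_b.
W, *_ = np.linalg.lstsq(feats_a, feats_b, rcond=None)

# Measure how well model A's features linearly predict model B's.
pred = feats_a @ W
r2 = 1 - ((feats_b - pred) ** 2).sum() / ((feats_b - feats_b.mean(0)) ** 2).sum()
print(f"alignment R^2: {r2:.3f}")
```

With real encoders the fit is imperfect, but the paper's observation is that even a plain linear map recovers much of the structure.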
Labeling neural representations with inverse recognition
Abstract Deep Neural Networks (DNNs) demonstrated remarkable capabilities in learning
complex hierarchical data representations, but the nature of these representations remains …
Find: A function description benchmark for evaluating interpretability methods
S Schwettmann, T Shaham… - Advances in …, 2024 - proceedings.neurips.cc
Labeling neural network submodules with human-legible descriptions is useful for many
downstream tasks: such descriptions can surface failures, guide interventions, and perhaps …
Faithful vision-language interpretation via concept bottleneck models
The demand for transparency in healthcare and finance has led to interpretable machine
learning (IML) models, notably the concept bottleneck models (CBMs), valued for their …
Towards a fuller understanding of neurons with clustered compositional explanations
Compositional Explanations is a method for identifying logical formulas of concepts that
approximate the neurons' behavior. However, these explanations are linked to the small …
Concept-based explainable artificial intelligence: A survey
The field of explainable artificial intelligence emerged in response to the growing need for
more transparent and reliable models. However, using raw features to provide explanations …
Information maximization perspective of orthogonal matching pursuit with applications to explainable ai
A Chattopadhyay, R Pilgrim… - Advances in Neural …, 2024 - proceedings.neurips.cc
Abstract Information Pursuit (IP) is a classical active testing algorithm for predicting an output
by sequentially and greedily querying the input in order of information gain. However, IP is …
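The snippet describes Information Pursuit as greedily querying the input in order of information gain. A toy sketch of that greedy loop — the discrete setup, deterministic query answers, and function names below are assumptions for illustration, not the paper's formulation: with answers deterministic given the class, the information gain of a query equals the entropy of its answer under the current posterior.

```python
import numpy as np

def binary_entropy(p):
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

def information_pursuit(answer_table, true_class, prior=None):
    """Greedily ask the query with the highest information gain until
    the posterior concentrates on a single class.

    answer_table[q, k]: deterministic binary answer of query q for class k.
    """
    n_queries, n_classes = answer_table.shape
    posterior = np.full(n_classes, 1.0 / n_classes) if prior is None else prior.copy()
    asked = []
    while np.count_nonzero(posterior) > 1 and len(asked) < n_queries:
        # Probability each query answers "1" under the current posterior;
        # with deterministic answers, info gain = entropy of the answer.
        p_yes = answer_table @ posterior
        gains = binary_entropy(p_yes)
        gains[asked] = -1.0  # never re-ask a query
        q = int(np.argmax(gains))
        asked.append(q)
        # Observe the answer and condition the posterior on it.
        ans = answer_table[q, true_class]
        posterior *= (answer_table[q] == ans)
        posterior /= posterior.sum()
    return asked, int(np.argmax(posterior))

# Toy example: 8 classes identified by 3 bits, one query per bit,
# so IP identifies the class after exactly 3 queries.
table = np.array([[(k >> b) & 1 for k in range(8)] for b in range(3)], dtype=float)
queries, guess = information_pursuit(table, true_class=5)
print(len(queries), guess)  # -> 3 5
```

The paper's contribution concerns relating this greedy information-gain selection to orthogonal matching pursuit; the sketch only illustrates the classical IP loop the snippet summarizes.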
Decomposing and editing predictions by modeling model computation
How does the internal computation of a machine learning model transform inputs into
predictions? In this paper, we introduce a task called component modeling that aims to …
Text2concept: Concept activation vectors directly from text
Abstract Concept activation vectors (CAVs) enable interpretability of a model with respect to
human concepts, though CAV generation requires the costly step of curating positive and …