Towards robust visual question answering: Making the most of biased samples via contrastive learning
Models for Visual Question Answering (VQA) often rely on spurious correlations, i.e., the
language priors, that appear in the biased samples of the training set, which make them brittle …
Learning to contrast the counterfactual samples for robust visual question answering
In the task of Visual Question Answering (VQA), most state-of-the-art models tend to learn
spurious correlations in the training set and achieve poor performance in out-of-distribution …
A Multi-modal Debiasing Model with Dynamical Constraint for Robust Visual Question Answering
Recent studies have pointed out that many well-developed Visual Question Answering
(VQA) systems suffer from the bias problem. Despite the remarkable performance gained on In …
Overcoming language priors for visual question answering via loss rebalancing label and global context
R Cao, Z Li - Uncertainty in Artificial Intelligence, 2023 - proceedings.mlr.press
Despite the advances in Visual Question Answering (VQA), many VQA models currently
suffer from language priors (i.e., generating answers directly from questions without using …
Overcoming language priors with self-supervised learning for visual question answering
Most Visual Question Answering (VQA) models suffer from the language prior problem,
which is caused by inherent data biases. Specifically, VQA models tend to answer questions …
Debiased visual question answering from feature and sample perspectives
Visual question answering (VQA) is designed to examine the visual-textual reasoning ability
of an intelligent agent. However, recent observations show that many VQA models may only …
Digging out discrimination information from generated samples for robust visual question answering
Visual Question Answering (VQA) aims to answer a textual question based on a
given image. Nevertheless, recent studies have shown that VQA models tend to capture the …
Overcoming language priors in visual question answering via distinguishing superficially similar instances
Despite the great progress of Visual Question Answering (VQA), current VQA models heavily
rely on the superficial correlation between the question type and its corresponding frequent …
Discovering the unknown knowns: Turning implicit knowledge in the dataset into explicit training examples for visual question answering
Visual question answering (VQA) is challenging not only because the model has to handle
multi-modal information, but also because it is just so hard to collect sufficient training …
Suppressing biased samples for robust VQA
Most existing visual question answering (VQA) models strongly rely on language bias to
answer questions, i.e., they tend to fit question-answer pairs on the train split and …