Towards robust visual question answering: Making the most of biased samples via contrastive learning

Q Si, Y Liu, F Meng, Z Lin, P Fu, Y Cao, W Wang… - arXiv preprint arXiv …, 2022 - arxiv.org
Models for Visual Question Answering (VQA) often rely on spurious correlations, i.e., the
language priors, that appear in the biased samples of the training set, which makes them brittle …

Learning to contrast the counterfactual samples for robust visual question answering

Z Liang, W Jiang, H Hu, J Zhu - Proceedings of the 2020 …, 2020 - aclanthology.org
In the task of Visual Question Answering (VQA), most state-of-the-art models tend to learn
spurious correlations in the training set and achieve poor performance in out-of-distribution …

A Multi-modal Debiasing Model with Dynamical Constraint for Robust Visual Question Answering

Y Li, B Hu, F Zhang, Y Yu, J Liu… - Findings of the …, 2023 - aclanthology.org
Recent studies have pointed out that many well-developed Visual Question Answering
(VQA) systems suffer from the bias problem. Despite the remarkable performance gained on In …

Overcoming language priors for visual question answering via loss rebalancing label and global context

R Cao, Z Li - Uncertainty in Artificial Intelligence, 2023 - proceedings.mlr.press
Despite the advances in Visual Question Answering (VQA), many VQA models currently
suffer from language priors (i.e., generating answers directly from questions without using …

Overcoming language priors with self-supervised learning for visual question answering

X Zhu, Z Mao, C Liu, P Zhang, B Wang… - arXiv preprint arXiv …, 2020 - arxiv.org
Most Visual Question Answering (VQA) models suffer from the language prior problem,
which is caused by inherent data biases. Specifically, VQA models tend to answer questions …

Debiased visual question answering from feature and sample perspectives

Z Wen, G Xu, M Tan, Q Wu… - Advances in Neural …, 2021 - proceedings.neurips.cc
Visual question answering (VQA) is designed to examine the visual-textual reasoning ability
of an intelligent agent. However, recent observations show that many VQA models may only …

Digging out discrimination information from generated samples for robust visual question answering

Z Wen, Y Wang, M Tan, Q Wu, Q Wu - Findings of the Association …, 2023 - aclanthology.org
Visual Question Answering (VQA) aims to answer a textual question based on a
given image. Nevertheless, recent studies have shown that VQA models tend to capture the …

Overcoming language priors in visual question answering via distinguishing superficially similar instances

Y Wu, Y Zhao, S Zhao, Y Zhang, X Yuan… - arXiv preprint arXiv …, 2022 - arxiv.org
Despite the great progress of Visual Question Answering (VQA), current VQA models heavily
rely on the superficial correlation between the question type and its corresponding frequent …

Discovering the unknown knowns: Turning implicit knowledge in the dataset into explicit training examples for visual question answering

J Kil, C Zhang, D Xuan, WL Chao - arXiv preprint arXiv:2109.06122, 2021 - arxiv.org
Visual question answering (VQA) is challenging not only because the model has to handle
multi-modal information, but also because it is just so hard to collect sufficient training …

Suppressing biased samples for robust VQA

N Ouyang, Q Huang, P Li, Y Cai, B Liu… - IEEE Transactions …, 2021 - ieeexplore.ieee.org
Most existing visual question answering (VQA) models strongly rely on language bias to
answer questions, i.e., they always tend to fit question-answer pairs on the train split and …