HCCL: Hierarchical Counterfactual Contrastive Learning for Robust Visual Question Answering

D Hao, Q Wang, X Zhu, J Liu - ACM Transactions on Multimedia Computing … - dl.acm.org
Although most state-of-the-art models have achieved impressive performance in visual
question answering (VQA), they often exploit biases to answer the question. Recently …

Learning to contrast the counterfactual samples for robust visual question answering

Z Liang, W Jiang, H Hu, J Zhu - Proceedings of the 2020 …, 2020 - aclanthology.org
In the task of Visual Question Answering (VQA), most state-of-the-art models tend to learn
spurious correlations in the training set and achieve poor performance in out-of-distribution …

Counterfactual samples synthesizing for robust visual question answering

L Chen, X Yan, J Xiao, H Zhang… - Proceedings of the …, 2020 - openaccess.thecvf.com
Although Visual Question Answering (VQA) has achieved impressive progress over
the last few years, today's VQA models tend to capture superficial linguistic correlations in …

Overcoming language priors with self-supervised learning for visual question answering

X Zhu, Z Mao, C Liu, P Zhang, B Wang… - arXiv preprint arXiv …, 2020 - arxiv.org
Most Visual Question Answering (VQA) models suffer from the language prior problem,
which is caused by inherent data biases. Specifically, VQA models tend to answer questions …

Counterfactual samples synthesizing and training for robust visual question answering

L Chen, Y Zheng, Y Niu, H Zhang… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Today's VQA models still tend to capture superficial linguistic correlations in the training set
and fail to generalize to the test set with different QA distributions. To reduce these language …

Distilling knowledge in causal inference for unbiased visual question answering

Y Pan, Z Li, L Zhang, J Tang - Proceedings of the 2nd ACM International …, 2021 - dl.acm.org
Current Visual Question Answering (VQA) models mainly explore the statistical correlations
between answers and questions, which fail to capture the relationship between the visual …

Overcoming language priors for visual question answering via loss rebalancing label and global context

R Cao, Z Li - Uncertainty in Artificial Intelligence, 2023 - proceedings.mlr.press
Despite the advances in Visual Question Answering (VQA), many VQA models currently
suffer from language priors (i.e., generating answers directly from questions without using …

Towards robust visual question answering: Making the most of biased samples via contrastive learning

Q Si, Y Liu, F Meng, Z Lin, P Fu, Y Cao, W Wang… - arXiv preprint arXiv …, 2022 - arxiv.org
Models for Visual Question Answering (VQA) often rely on spurious correlations, i.e., the
language priors, that appear in the biased samples of the training set, which makes them brittle …

Unbiased Visual Question Answering by Leveraging Instrumental Variable

Y Pan, J Liu, L Jin, Z Li - IEEE Transactions on Multimedia, 2024 - ieeexplore.ieee.org
Existing unbiased visual question answering (VQA) models reduce the spurious correlation
between questions and answers to force the models to focus on visual information …

Fair Attention Network for Robust Visual Question Answering

Y Bi, H Jiang, Y Hu, Y Sun, B Yin - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
As a prevailing cross-modal reasoning task, Visual Question Answering (VQA) has achieved
impressive progress in the last few years, where the language bias is widely studied to learn …