Overcoming language priors for visual question answering via loss rebalancing label and global context

R Cao, Z Li - Uncertainty in Artificial Intelligence, 2023 - proceedings.mlr.press
Despite the advances in Visual Question Answering (VQA), many VQA models currently
suffer from language priors (i.e., generating answers directly from questions without using …

Overcoming language priors with self-supervised learning for visual question answering

X Zhu, Z Mao, C Liu, P Zhang, B Wang… - arXiv preprint arXiv …, 2020 - arxiv.org
Most Visual Question Answering (VQA) models suffer from the language prior problem,
which is caused by inherent data biases. Specifically, VQA models tend to answer questions …

Cross Modality Bias in Visual Question Answering: A Causal View with Possible Worlds VQA

A Vosoughi, S Deng, S Zhang, Y Tian… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
To increase the generalization capability of VQA systems, many recent studies have tried to
de-bias spurious language or vision associations that shortcut the question or image to the …

Overcoming language priors via shuffling language bias for robust visual question answering

J Zhao, Z Yu, X Zhang, Y Yang - IEEE Access, 2023 - ieeexplore.ieee.org
Recent research has revealed the notorious language prior problem in visual question
answering (VQA) tasks based on visual-textual interaction, which indicates that well …

Robust visual question answering via semantic cross modal augmentation

A Mashrur, W Luo, NA Zaidi, A Robles-Kelly - Computer Vision and Image …, 2024 - Elsevier
Recent advances in vision-language models have resulted in improved accuracy in visual
question answering (VQA) tasks. However, their robustness remains limited when faced with …

Multi-stage reasoning on introspecting and revising bias for visual question answering

AA Liu, Z Lu, N Xu, M Liu, C Yan, B Zheng… - ACM Transactions on …, 2023 - dl.acm.org
Visual Question Answering (VQA) is a task that involves predicting an answer to a question
depending on the content of an image. However, recent VQA methods have relied more on …

Towards robust visual question answering: Making the most of biased samples via contrastive learning

Q Si, Y Liu, F Meng, Z Lin, P Fu, Y Cao, W Wang… - arXiv preprint arXiv …, 2022 - arxiv.org
Models for Visual Question Answering (VQA) often rely on the spurious correlations, i.e., the
language priors, that appear in the biased samples of training set, which make them brittle …

VQA-BC: Robust visual question answering via bidirectional chaining

M Lao, Y Guo, W Chen, N Pu… - ICASSP 2022-2022 IEEE …, 2022 - ieeexplore.ieee.org
Current VQA models are suffering from the problem of overdependence on language bias,
which severely reduces their robustness in real-world scenarios. In this paper, we analyze …

Suppressing biased samples for robust VQA

N Ouyang, Q Huang, P Li, Y Cai, B Liu… - IEEE Transactions …, 2021 - ieeexplore.ieee.org
Most existing visual question answering (VQA) models strongly rely on language bias to
answer questions, i.e., they always tend to fit question-answer pairs on the train split and …

Debiased Visual Question Answering via the perspective of question types

T Huai, S Yang, J Zhang, J Zhao, L He - Pattern Recognition Letters, 2024 - Elsevier
Visual Question Answering (VQA) aims to answer questions according to the given
image. However, current VQA models tend to rely solely on textual information from the …