Overcoming language priors in VQA via decomposed linguistic representations

C Jing, Y Wu, X Zhang, Y Jia, Q Wu - … of the AAAI conference on artificial …, 2020 - aaai.org
Abstract Most existing Visual Question Answering (VQA) models overly rely on language
priors between questions and answers. In this paper, we present a novel method of …

Be flexible! learn to debias by sampling and prompting for robust visual question answering

J Liu, CF Fan, F Zhou, H Xu - Information Processing & Management, 2023 - Elsevier
Recent studies point out that VQA models tend to rely on the language prior in the training
data to answer the questions, which prevents the VQA model from generalizing to the out …

Mix-tower: Light visual question answering framework based on exclusive self-attention mechanism

D Chen, J Chen, L Yang, F Shang - Neurocomputing, 2024 - Elsevier
Visual question answering (VQA) holds the potential to enhance artificial intelligence's
proficiency in understanding natural language and stimulate advances in computer vision …

Task-driven visual saliency and attention-based visual question answering

Y Lin, Z Pang, D Wang, Y Zhuang - arXiv preprint arXiv:1702.06700, 2017 - arxiv.org
Visual question answering (VQA) has witnessed great progress since May 2015 as a
classic problem unifying visual and textual data into a system. Many enlightening VQA works …

Visual question answering with attention transfer and a cross-modal gating mechanism

W Li, J Sun, G Liu, L Zhao, X Fang - Pattern Recognition Letters, 2020 - Elsevier
Visual question answering (VQA) is challenging since it requires understanding both
language information and the corresponding visual content. A lot of effort has been made to …

Generative bias for robust visual question answering

JW Cho, DJ Kim, H Ryu… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Abstract The task of Visual Question Answering (VQA) is known to be plagued by the issue
of VQA models exploiting biases within the dataset to make their final predictions. Various …

Adversarial regularization for visual question answering: Strengths, shortcomings, and side effects

G Grand, Y Belinkov - arXiv preprint arXiv:1906.08430, 2019 - arxiv.org
Visual question answering (VQA) models have been shown to over-rely on linguistic biases
in VQA datasets, answering questions "blindly" without considering visual context …

Cycle-consistency for robust visual question answering

M Shah, X Chen, M Rohrbach… - Proceedings of the …, 2019 - openaccess.thecvf.com
Despite significant progress in Visual Question Answering over the years, the robustness of
today's VQA models leaves much to be desired. We introduce a new evaluation protocol and …

Visual question answering

A Nada, M Chen - 2024 International Conference on …, 2024 - ieeexplore.ieee.org
Visual question answering (VQA) is a comprehensive artificial intelligence (AI) and computer
vision (CV) task that answers questions about the visual content of an image, such as …

Contrast and classify: Training robust VQA models

Y Kant, A Moudgil, D Batra, D Parikh… - Proceedings of the …, 2021 - openaccess.thecvf.com
Abstract Recent Visual Question Answering (VQA) models have shown impressive
performance on the VQA benchmark but remain sensitive to small linguistic variations in …