Overcoming language priors in VQA via decomposed linguistic representations

C Jing, Y Wu, X Zhang, Y Jia, Q Wu - … of the AAAI conference on artificial …, 2020 - aaai.org
Abstract Most existing Visual Question Answering (VQA) models overly rely on language
priors between questions and answers. In this paper, we present a novel method of …

Be flexible! learn to debias by sampling and prompting for robust visual question answering

J Liu, CF Fan, F Zhou, H Xu - Information Processing & Management, 2023 - Elsevier
Recent studies point out that VQA models tend to rely on the language prior in the training
data to answer the questions, which prevents the VQA model from generalizing to the out …

Mix-tower: Light visual question answering framework based on exclusive self-attention mechanism

D Chen, J Chen, L Yang, F Shang - Neurocomputing, 2024 - Elsevier
Visual question answering (VQA) holds the potential to enhance artificial intelligence's
proficiency in understanding natural language and stimulate advances in computer vision …

Task-driven visual saliency and attention-based visual question answering

Y Lin, Z Pang, D Wang, Y Zhuang - arXiv preprint arXiv:1702.06700, 2017 - arxiv.org
Visual question answering (VQA) has witnessed great progress since May 2015 as a
classic problem unifying visual and textual data into a system. Many enlightening VQA works …

Visual question answering with attention transfer and a cross-modal gating mechanism

W Li, J Sun, G Liu, L Zhao, X Fang - Pattern Recognition Letters, 2020 - Elsevier
Visual question answering (VQA) is challenging since it requires understanding both
language information and the corresponding visual content. A lot of effort has been made to …

Generative bias for robust visual question answering

JW Cho, DJ Kim, H Ryu… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Abstract The task of Visual Question Answering (VQA) is known to be plagued by the issue
of VQA models exploiting biases within the dataset to make their final predictions. Various …

Adversarial regularization for visual question answering: Strengths, shortcomings, and side effects

G Grand, Y Belinkov - arXiv preprint arXiv:1906.08430, 2019 - arxiv.org
Visual question answering (VQA) models have been shown to over-rely on linguistic biases
in VQA datasets, answering questions "blindly" without considering visual context …

Cycle-consistency for robust visual question answering

M Shah, X Chen, M Rohrbach… - Proceedings of the …, 2019 - openaccess.thecvf.com
Despite significant progress in Visual Question Answering over the years, the robustness of
today's VQA models leaves much to be desired. We introduce a new evaluation protocol and …

Visual question answering

A Nada, M Chen - 2024 International Conference on …, 2024 - ieeexplore.ieee.org
Visual question answering (VQA) is a comprehensive artificial intelligence (AI) and computer
vision (CV) task that answers questions about the visual content of an image, such as …

Contrast and classify: Training robust VQA models

Y Kant, A Moudgil, D Batra, D Parikh… - Proceedings of the …, 2021 - openaccess.thecvf.com
Abstract Recent Visual Question Answering (VQA) models have shown impressive
performance on the VQA benchmark but remain sensitive to small linguistic variations in …