Related articles - Academic resource search

Overcoming language priors for visual question answering via loss rebalancing label and global context

R Cao, Z Li - Uncertainty in Artificial Intelligence, 2023 - proceedings.mlr.press

Despite the advances in Visual Question Answering (VQA), many VQA models currently
suffer from language priors (i.e., generating answers directly from questions without using …

Cited by 2 · Related articles · All 5 versions

Overcoming language priors with self-supervised learning for visual question answering

X Zhu, Z Mao, C Liu, P Zhang, B Wang… - arXiv preprint arXiv …, 2020 - arxiv.org

Most Visual Question Answering (VQA) models suffer from the language prior problem,
which is caused by inherent data biases. Specifically, VQA models tend to answer questions …

Cited by 106 · Related articles · All 5 versions

Cross Modality Bias in Visual Question Answering: A Causal View with Possible Worlds VQA

A Vosoughi, S Deng, S Zhang, Y Tian… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org

To increase the generalization capability of VQA systems, many recent studies have tried to
de-bias spurious language or vision associations that shortcut the question or image to the …

Cited by 1 · Related articles · All 2 versions

Overcoming language priors via shuffling language bias for robust visual question answering

J Zhao, Z Yu, X Zhang, Y Yang - IEEE Access, 2023 - ieeexplore.ieee.org

Recent research has revealed the notorious language prior problem in visual question
answering (VQA) tasks based on visual-textual interaction, which indicates that well …

Cited by 3 · Related articles · All 2 versions

Robust visual question answering via semantic cross modal augmentation

A Mashrur, W Luo, NA Zaidi, A Robles-Kelly - Computer Vision and Image …, 2024 - Elsevier

Recent advances in vision-language models have resulted in improved accuracy in visual
question answering (VQA) tasks. However, their robustness remains limited when faced with …

Cited by 3 · Related articles · All 4 versions

Multi-stage reasoning on introspecting and revising bias for visual question answering

AA Liu, Z Lu, N Xu, M Liu, C Yan, B Zheng… - ACM Transactions on …, 2023 - dl.acm.org

Visual Question Answering (VQA) is a task that involves predicting an answer to a question
depending on the content of an image. However, recent VQA methods have relied more on …

Towards robust visual question answering: Making the most of biased samples via contrastive learning

Q Si, Y Liu, F Meng, Z Lin, P Fu, Y Cao, W Wang… - arXiv preprint arXiv …, 2022 - arxiv.org

Models for Visual Question Answering (VQA) often rely on the spurious correlations, i.e., the
language priors, that appear in the biased samples of training set, which make them brittle …

Cited by 16 · Related articles · All 3 versions

VQA-BC: Robust visual question answering via bidirectional chaining

M Lao, Y Guo, W Chen, N Pu… - ICASSP 2022-2022 IEEE …, 2022 - ieeexplore.ieee.org

Current VQA models are suffering from the problem of overdependence on language bias,
which severely reduces their robustness in real-world scenarios. In this paper, we analyze …

Cited by 4 · Related articles

Suppressing biased samples for robust VQA

N Ouyang, Q Huang, P Li, Y Cai, B Liu… - IEEE Transactions …, 2021 - ieeexplore.ieee.org

Most existing visual question answering (VQA) models strongly rely on language bias to
answer questions, i.e., they always tend to fit question-answer pairs on the train split and …

Cited by 22 · Related articles · All 3 versions

Debiased Visual Question Answering via the perspective of question types

T Huai, S Yang, J Zhang, J Zhao, L He - Pattern Recognition Letters, 2024 - Elsevier

Visual Question Answering (VQA) aims to answer questions according to the given
image. However, current VQA models tend to rely solely on textual information from the …

Cited by 2 · Related articles · All 3 versions