Rethinking data augmentation for robust visual question answering

L Chen, Y Zheng, J Xiao - European conference on computer vision, 2022 - Springer
Data Augmentation (DA)—generating extra training samples beyond the original training set—
has been widely used in today's unbiased VQA models to mitigate language biases. Current …

Language bias in visual question answering: A survey and taxonomy

D Yuan - arXiv preprint arXiv:2111.08531, 2021 - arxiv.org
Visual question answering (VQA) is a challenging task that has attracted increasing
attention in the fields of computer vision and natural language processing. However, the …

A critical analysis of benchmarks, techniques, and models in medical visual question answering

S Al-Hadhrami, MEB Menai, S Al-Ahmadi… - IEEE …, 2023 - ieeexplore.ieee.org
This paper comprehensively reviews medical VQA models, structures, and datasets,
focusing on combining vision and language. Over 75 models and their statistical and SWOT …

Towards robust visual question answering: Making the most of biased samples via contrastive learning

Q Si, Y Liu, F Meng, Z Lin, P Fu, Y Cao, W Wang… - arXiv preprint arXiv …, 2022 - arxiv.org
Models for Visual Question Answering (VQA) often rely on spurious correlations, i.e., the
language priors, that appear in the biased samples of the training set, which make them brittle …

Be flexible! Learn to debias by sampling and prompting for robust visual question answering

J Liu, CF Fan, F Zhou, H Xu - Information Processing & Management, 2023 - Elsevier
Recent studies point out that VQA models tend to rely on the language prior in the training
data to answer the questions, which prevents the VQA model from generalization on the out …

Question-conditioned debiasing with focal visual context fusion for visual question answering

J Liu, GX Wang, CF Fan, F Zhou, HJ Xu - Knowledge-Based Systems, 2023 - Elsevier
Existing Visual Question Answering models suffer from the language prior, where
the answers provided by the models overly rely on the correlations between questions and …

Language prior is not the only shortcut: A benchmark for shortcut learning in vqa

Q Si, F Meng, M Zheng, Z Lin, Y Liu, P Fu… - arXiv preprint arXiv …, 2022 - arxiv.org
Visual Question Answering (VQA) models are prone to learn the shortcut solution formed by
dataset biases rather than the intended solution. To evaluate the VQA models' reasoning …

Digging out discrimination information from generated samples for robust visual question answering

Z Wen, Y Wang, M Tan, Q Wu, Q Wu - Findings of the Association …, 2023 - aclanthology.org
Visual Question Answering (VQA) aims to answer a textual question based on a
given image. Nevertheless, recent studies have shown that VQA models tend to capture the …

Overcoming language priors for visual question answering via loss rebalancing label and global context

R Cao, Z Li - Uncertainty in Artificial Intelligence, 2023 - proceedings.mlr.press
Despite the advances in Visual Question Answering (VQA), many VQA models currently
suffer from language priors (i.e., generating answers directly from questions without using …

Robust visual question answering: Datasets, methods, and future challenges

J Ma, P Wang, D Kong, Z Wang, J Liu… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Visual question answering requires a system to provide an accurate natural language
answer given an image and a natural language question. However, it is widely recognized …