Counterfactual Mix-up for visual question answering

4 results

Semi-supervised image captioning by adversarially propagating labeled data

DJ Kim, TH Oh, J Choi, IS Kweon - IEEE Access, 2024 - ieeexplore.ieee.org

We present a novel data-efficient semi-supervised framework to improve the generalization
of image captioning models. Constructing a large-scale labeled image captioning dataset is …

Cited by: 6

Preserving Multi-Modal Capabilities of Pre-trained VLMs for Improving Vision-Linguistic Compositionality

Y Oh, JW Cho, DJ Kim, IS Kweon, J Kim - arXiv preprint arXiv:2410.05210, 2024 - arxiv.org

In this paper, we propose a new method to enhance compositional understanding in pre-
trained vision and language models (VLMs) without sacrificing performance in zero-shot …

Unbiased Visual Question Answering by Leveraging Instrumental Variable

Y Pan, J Liu, L Jin, Z Li - IEEE Transactions on Multimedia, 2024 - ieeexplore.ieee.org

Existing unbiased visual question answering (VQA) models reduce the spurious correlation
between questions and answers to force the models to focus on visual information …

Cited by: 2

Enhanced Visual Question Answering System Using DenseNet

S Nithish, EM Kawinbalaji… - … on Advances in Data …, 2024 - ieeexplore.ieee.org

Visual Question Answering (VQA) system represents an essential usage of computer vision
and natural language processing, enabling machines to understand and react to inquiries …