SelfGraphVQA: A self-supervised graph neural network for scene-based question answering

BC de Oliveira Souza, M Aasan… - Proceedings of the …, 2023 - openaccess.thecvf.com
The intersection of vision and language is of major interest due to the increased focus on
seamless integration between recognition and reasoning. Scene graphs (SGs) have …

VQA therapy: Exploring answer differences by visually grounding answers

C Chen, S Anjum, D Gurari - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
Visual question answering is a task of predicting the answer to a question about an image.
Given that different people can provide different answers to a visual question, we aim to …

What's in a name? Answer equivalence for open-domain question answering

C Si, C Zhao, J Boyd-Graber - arXiv preprint arXiv:2109.05289, 2021 - arxiv.org
A flaw in QA evaluation is that annotations often only provide one gold answer. Thus, model
predictions semantically equivalent to the answer but superficially different are considered …

Towards Addressing the Misalignment of Object Proposal Evaluation for Vision-Language Tasks via Semantic Grounding

J Feinglass, Y Yang - Proceedings of the IEEE/CVF Winter …, 2024 - openaccess.thecvf.com
Object proposal generation serves as a standard pre-processing step in Vision-Language
(VL) tasks (image captioning, visual question answering, etc.). The performance of object …

Towards answering open-ended ethical quandary questions

Y Bang, N Lee, T Yu, L Khalatbari, Y Xu… - arXiv preprint arXiv …, 2022 - arxiv.org
Considerable advancements have been made in various NLP tasks based on the
impressive power of large language models (LLMs) and many NLP applications are …