Visual question answering: A survey of methods and datasets
Abstract Visual Question Answering (VQA) is a challenging task that has received increasing
attention from both the computer vision and the natural language processing communities …
Knowledge base graph embedding module design for Visual question answering model
In this paper, a knowledge base graph embedding module is constructed to extend the
versatility of knowledge-based VQA (Visual Question Answering) models. The knowledge …
Don't just assume; look and answer: Overcoming priors for visual question answering
A number of studies have found that today's Visual Question Answering (VQA) models are
heavily driven by superficial correlations in the training data and lack sufficient image …
Tips and tricks for visual question answering: Learnings from the 2017 challenge
This paper presents a state-of-the-art model for visual question answering (VQA), which won
the first place in the 2017 VQA Challenge. VQA is a task of significant importance for …
Fvqa: Fact-based visual question answering
Visual Question Answering (VQA) has attracted much attention in both computer vision and
natural language processing communities, not least because it offers insight into the …
Graph-structured representations for visual question answering
This paper proposes to improve visual question answering (VQA) with structured
representations of both scene contents and questions. A key challenge in VQA is to require …
Out of the box: Reasoning with graph convolution nets for factual visual question answering
M Narasimhan, S Lazebnik… - Advances in neural …, 2018 - proceedings.neurips.cc
Accurately answering a question about a given image requires combining observations with
general knowledge. While this is effortless for humans, reasoning with general knowledge …
Image captioning and visual question answering based on attributes and external knowledge
Much of the recent progress in Vision-to-Language problems has been achieved through a
combination of Convolutional Neural Networks (CNNs) and Recurrent Neural Networks …
Analyzing the behavior of visual question answering models
Recently, a number of deep-learning based models have been proposed for the task of
Visual Question Answering (VQA). The performance of most models is clustered around 60 …
Simple baseline for visual question answering
We describe a very simple bag-of-words baseline for visual question answering. This
baseline concatenates the word features from the question and CNN features from the …