Authors
Ruixue Tang, Chao Ma, Wei Emma Zhang, Qi Wu, Xiaokang Yang
Publication date
2020
Conference
Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XIX 16
Pages
437-453
Publisher
Springer International Publishing
Description
Visual Question Answering (VQA) has achieved great success thanks to the fast development of deep neural networks (DNN). On the other hand, the data augmentation, as one of the major tricks for DNN, has been widely used in many computer vision tasks. However, there are few works studying the data augmentation problem for VQA and none of the existing image based augmentation schemes (such as rotation and flipping) can be directly applied to VQA due to its semantic structure – an $\langle image, question, answer\rangle$ triplet needs to be maintained correctly. For …
Total citations
[Citations-per-year bar chart, 2020–2024]
Scholar articles
R Tang, C Ma, WE Zhang, Q Wu, X Yang - Computer Vision–ECCV 2020: 16th European …, 2020