Semantic equivalent adversarial data augmentation for visual question answering

S Uppal, S Bhagat, D Hazarika, N Majumder, S Poria… - Information …, 2022 - Elsevier

Deep Learning and its applications have cascaded impactful research and development
with a diverse range of modalities present in the real-world data. More recently, this has …

Cited by 82 Related articles All 5 versions

A survey of data augmentation approaches for NLP

SY Feng, V Gangal, J Wei, S Chandar… - arXiv preprint arXiv …, 2021 - arxiv.org

Data augmentation has recently seen increased interest in NLP due to more work in
low-resource domains, new tasks, and the popularity of large-scale neural networks that require …

Cited by 736 Related articles All 9 versions

Teaching structured vision & language concepts to vision & language models

S Doveh, A Arbelle, S Harary… - Proceedings of the …, 2023 - openaccess.thecvf.com

Vision and Language (VL) models have demonstrated remarkable zero-shot performance in
a variety of tasks. However, some aspects of complex language understanding still remain a …

Cited by 40 Related articles All 8 versions

MixGen: A new multi-modal data augmentation

X Hao, Y Zhu, S Appalaraju, A Zhang… - Proceedings of the …, 2023 - openaccess.thecvf.com

Data augmentation is a necessity to enhance data efficiency in deep learning. For
vision-language pre-training, data is only augmented either for images or for text in previous works …

Cited by 62 Related articles All 8 versions

Adversarial attack and defense technologies in natural language processing: A survey

S Qiu, Q Liu, S Zhou, W Huang - Neurocomputing, 2022 - Elsevier

Recently, the adversarial attack and defense technology has made remarkable
achievements and has been widely applied in the computer vision field, promoting its rapid …

Cited by 54 Related articles All 2 versions

Rethinking data augmentation for robust visual question answering

L Chen, Y Zheng, J Xiao - European conference on computer vision, 2022 - Springer

Data Augmentation (DA)—generating extra training samples beyond the original training set—
has been widely-used in today's unbiased VQA models to mitigate language biases. Current …

Cited by 39 Related articles All 5 versions

SimVQA: Exploring simulated environments for visual question answering

P Cascante-Bonilla, H Wu, L Wang… - Proceedings of the …, 2022 - openaccess.thecvf.com

Existing work on VQA explores data augmentation to achieve better generalization by
perturbing the images in the dataset or modifying the existing questions and answers. While …

Cited by 36 Related articles All 8 versions

Unshuffling data for improved generalization in visual question answering

D Teney, E Abbasnejad… - Proceedings of the …, 2021 - openaccess.thecvf.com

Generalization beyond the training distribution is a core challenge in machine learning. The
common practice of mixing and shuffling examples when training neural networks may not …

Cited by 100 Related articles All 11 versions

VQAMix: Conditional triplet mixup for medical visual question answering

H Gong, G Chen, M Mao, Z Li… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org

Medical visual question answering (VQA) aims to correctly answer a clinical question related
to a given medical image. Nevertheless, owing to the expensive manual annotations of …

Cited by 29 Related articles All 4 versions

A survey on VQA: Datasets and approaches

Y Zou, Q Xie - 2020 2nd International Conference on …, 2020 - ieeexplore.ieee.org

Visual question answering (VQA) is a task that combines both the techniques of computer
vision and natural language processing. It requires models to answer a text-based question …

Cited by 15 Related articles All 7 versions