Multimodal research in vision and language: A review of current and emerging trends

S Uppal, S Bhagat, D Hazarika, N Majumder, S Poria… - Information …, 2022 - Elsevier
Deep Learning and its applications have cascaded impactful research and development
with a diverse range of modalities present in the real-world data. More recently, this has …

A survey and taxonomy of adversarial neural networks for text‐to‐image synthesis

J Agnese, J Herrera, H Tao… - … Reviews: Data Mining and …, 2020 - Wiley Online Library
Text‐to‐image synthesis refers to computational methods which translate human written
textual descriptions, in the form of keywords or sentences, into images with similar semantic …

Application of integrated steganography and image compressing techniques for confidential information transmission

BK Pandey, D Pandey, S Wairya… - Cyber security and …, 2022 - Wiley Online Library
In the present day, images and videos account for nearly 80% of all the data transmitted
during our daily activities. This work employs a combination of novel steganography and data …

RiFeGAN: Rich feature generation for text-to-image synthesis from prior knowledge

J Cheng, F Wu, Y Tian, L Wang… - Proceedings of the IEEE …, 2020 - openaccess.thecvf.com
Text-to-image synthesis is a challenging task that generates realistic images from a textual
sequence, which usually contains limited information compared with the corresponding …

Dual alignment unsupervised domain adaptation for video-text retrieval

X Hao, W Zhang, D Wu, F Zhu… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Video-text retrieval is an emerging stream in both computer vision and natural language
processing communities, which aims to find relevant videos given text queries. In this paper …

Knowing what to learn: a metric-oriented focal mechanism for image captioning

J Ji, Y Ma, X Sun, Y Zhou, Y Wu… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Despite considerable progress, image captioning still suffers from the huge difference in
quality between easy and hard examples, which is left unexploited in existing methods. To …

CKD: Cross-task knowledge distillation for text-to-image synthesis

M Yuan, Y Peng - IEEE Transactions on Multimedia, 2019 - ieeexplore.ieee.org
Text-to-image synthesis (T2IS) has drawn increasing interest recently, which can
automatically generate images conditioned on text descriptions. It is a highly challenging …

Recent advances of image steganography with generative adversarial networks

J Liu, Y Ke, Z Zhang, Y Lei, J Li, M Zhang… - IEEE Access, 2020 - ieeexplore.ieee.org
In the past few years, the Generative Adversarial Network (GAN), which was proposed in 2014,
has achieved great success. There have been increasing research achievements based on …

Semi-supervised medical report generation via graph-guided hybrid feature consistency

K Zhang, H Jiang, J Zhang, Q Huang… - IEEE Transactions …, 2023 - ieeexplore.ieee.org
Medical report generation generates the corresponding report according to the given
radiology image, which has been attracting increasing research interest. However, existing …

Image caption generation using visual attention prediction and contextual spatial relation extraction

R Sasibhooshan, S Kumaraswamy, S Sasidharan - Journal of Big Data, 2023 - Springer
Automatic caption generation with attention mechanisms aims at generating more
descriptive captions containing coarser to finer semantic contents in the image. In this work …