Towards zero-shot learning: A brief review and an attention-based embedding network

GS Xie, Z Zhang, H Xiong, L Shao… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Zero-shot learning (ZSL), an emerging topic in recent years, targets at distinguishing unseen
class images by taking images from seen classes for training the classifier. Existing works …

Deep multimodal transfer learning for cross-modal retrieval

L Zhen, P Hu, X Peng, RSM Goh… - IEEE Transactions on …, 2020 - ieeexplore.ieee.org
Cross-modal retrieval (CMR) enables flexible retrieval experience across different
modalities (e.g., texts versus images), which maximally benefits us from the abundance of …

Joint feature synthesis and embedding: Adversarial cross-modal retrieval revisited

X Xu, K Lin, Y Yang, A Hanjalic… - IEEE Transactions on …, 2020 - ieeexplore.ieee.org
Recently, generative adversarial network (GAN) has shown its strong ability on modeling
data distribution via adversarial learning. Cross-modal GAN, which attempts to utilize the …

TextControlGAN: Text-to-image synthesis with controllable generative adversarial networks

H Ku, M Lee - Applied Sciences, 2023 - mdpi.com
Generative adversarial networks (GANs) have demonstrated remarkable potential in the
realm of text-to-image synthesis. Nevertheless, conventional GANs employing conditional …

Adversarial-metric learning for audio-visual cross-modal matching

A Zheng, M Hu, B Jiang, Y Huang… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
Audio-visual matching aims to learn the intrinsic correspondence between image and audio
clip. Existing works mainly concentrate on learning discriminative features, while ignore the …

Discriminative and robust attribute alignment for zero-shot learning

D Cheng, G Wang, N Wang, D Zhang… - … on Circuits and …, 2023 - ieeexplore.ieee.org
Zero-shot learning (ZSL) aims to learn models that can recognize images of semantically
related unseen categories, through transferring attribute-based knowledge learned from …

Region reinforcement network with topic constraint for image-text matching

J Wu, C Wu, J Lu, L Wang, X Cui - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
Image and sentence matching has attracted increasing attention since it is associated with
two important modalities of vision and language. Previous methods aim to find the latent …

Bridge-GAN: Interpretable representation learning for text-to-image synthesis

M Yuan, Y Peng - IEEE Transactions on Circuits and Systems …, 2019 - ieeexplore.ieee.org
Text-to-image synthesis is to generate images with the consistent content as the given text
description, which is a highly challenging task with two main issues: visual reality and …

Image-text retrieval with cross-modal semantic importance consistency

Z Liu, F Chen, J Xu, W Pei, G Lu - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Cross-modal image-text retrieval is an important area of Vision-and-Language task that
models the similarity of image-text pairs by embedding features into a shared space for …

Dual-aligned feature confusion alleviation for generalized zero-shot learning

H Su, J Li, K Lu, L Zhu, HT Shen - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Generalized zero-shot learning (GZSL) aims to recognize both seen and unseen samples by
leveraging the connections between semantic and visual representations. Recently, a …