Zero-shot cross-media embedding learning with dual adversarial distribution network

GS Xie, Z Zhang, H Xiong, L Shao… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org

Zero-shot learning (ZSL), an emerging topic in recent years, targets at distinguishing unseen
class images by taking images from seen classes for training the classifier. Existing works …

Cited by 31 - Related articles - All 2 versions

Deep multimodal transfer learning for cross-modal retrieval

L Zhen, P Hu, X Peng, RSM Goh… - IEEE Transactions on …, 2020 - ieeexplore.ieee.org

Cross-modal retrieval (CMR) enables flexible retrieval experience across different
modalities (e.g., texts versus images), which maximally benefits us from the abundance of …

Cited by 78 - Related articles - All 5 versions

Joint feature synthesis and embedding: Adversarial cross-modal retrieval revisited

X Xu, K Lin, Y Yang, A Hanjalic… - IEEE Transactions on …, 2020 - ieeexplore.ieee.org

Recently, generative adversarial network (GAN) has shown its strong ability on modeling
data distribution via adversarial learning. Cross-modal GAN, which attempts to utilize the …

Cited by 72 - Related articles - All 6 versions

TextControlGAN: Text-to-image synthesis with controllable generative adversarial networks

H Ku, M Lee - Applied Sciences, 2023 - mdpi.com

Generative adversarial networks (GANs) have demonstrated remarkable potential in the
realm of text-to-image synthesis. Nevertheless, conventional GANs employing conditional …

Cited by 33 - Related articles - All 6 versions

Adversarial-metric learning for audio-visual cross-modal matching

A Zheng, M Hu, B Jiang, Y Huang… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org

Audio-visual matching aims to learn the intrinsic correspondence between image and audio
clip. Existing works mainly concentrate on learning discriminative features, while ignore the …

Cited by 48 - Related articles - All 4 versions

Discriminative and robust attribute alignment for zero-shot learning

D Cheng, G Wang, N Wang, D Zhang… - … on Circuits and …, 2023 - ieeexplore.ieee.org

Zero-shot learning (ZSL) aims to learn models that can recognize images of semantically
related unseen categories, through transferring attribute-based knowledge learned from …

Cited by 24 - Related articles - All 2 versions

Region reinforcement network with topic constraint for image-text matching

J Wu, C Wu, J Lu, L Wang, X Cui - IEEE Transactions on …, 2021 - ieeexplore.ieee.org

Image and sentence matching has attracted increasing attention since it is associated with
two important modalities of vision and language. Previous methods aim to find the latent …

Cited by 41 - Related articles - All 2 versions

Bridge-GAN: Interpretable representation learning for text-to-image synthesis

M Yuan, Y Peng - IEEE Transactions on Circuits and Systems …, 2019 - ieeexplore.ieee.org

Text-to-image synthesis is to generate images with the consistent content as the given text
description, which is a highly challenging task with two main issues: visual reality and …

Cited by 64 - Related articles

Image-text retrieval with cross-modal semantic importance consistency

Z Liu, F Chen, J Xu, W Pei, G Lu - IEEE Transactions on …, 2022 - ieeexplore.ieee.org

Cross-modal image-text retrieval is an important area of Vision-and-Language task that
models the similarity of image-text pairs by embedding features into a shared space for …

Cited by 18 - Related articles - All 2 versions

Dual-aligned feature confusion alleviation for generalized zero-shot learning

H Su, J Li, K Lu, L Zhu, HT Shen - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

Generalized zero-shot learning (GZSL) aims to recognize both seen and unseen samples by
leveraging the connections between semantic and visual representations. Recently, a …

Cited by 14 - Related articles - All 2 versions