A review on multimodal zero-shot learning

W Cao, Y Wu, Y Sun, H Zhang, J Ren… - … : Data Mining and …, 2023 - Wiley Online Library
Multimodal learning provides a path to fully utilize all types of information related to the
modeling target to provide the model with a global vision. Zero-shot learning (ZSL) is a …

Ternary adversarial networks with self-supervision for zero-shot cross-modal retrieval

X Xu, H Lu, J Song, Y Yang… - IEEE transactions on …, 2019 - ieeexplore.ieee.org
Given a query instance from one modality (e.g., image), cross-modal retrieval aims to find
semantically similar instances from another modality (e.g., text). To perform cross-modal …

Deep multimodal transfer learning for cross-modal retrieval

L Zhen, P Hu, X Peng, RSM Goh… - IEEE Transactions on …, 2020 - ieeexplore.ieee.org
Cross-modal retrieval (CMR) enables flexible retrieval experience across different
modalities (e.g., texts versus images), which maximally benefits us from the abundance of …

Joint feature synthesis and embedding: Adversarial cross-modal retrieval revisited

X Xu, K Lin, Y Yang, A Hanjalic… - IEEE Transactions on …, 2020 - ieeexplore.ieee.org
Recently, generative adversarial network (GAN) has shown its strong ability on modeling
data distribution via adversarial learning. Cross-modal GAN, which attempts to utilize the …

Learning cross-aligned latent embeddings for zero-shot cross-modal retrieval

K Lin, X Xu, L Gao, Z Wang, HT Shen - Proceedings of the AAAI …, 2020 - ojs.aaai.org
Zero-Shot Cross-Modal Retrieval (ZS-CMR) is an emerging research hotspot that
aims to retrieve data of new classes across different modality data. It is challenging for not …

PAN: Prototype-based adaptive network for robust cross-modal retrieval

Z Zeng, S Wang, N Xu, W Mao - … of the 44th international ACM SIGIR …, 2021 - dl.acm.org
In practical applications of cross-modal retrieval, test queries of the retrieval system may vary
greatly and come from unknown category. Meanwhile, due to the cost and difficulty of data …

Zero-shot cross-media embedding learning with dual adversarial distribution network

J Chi, Y Peng - IEEE Transactions on Circuits and Systems for …, 2019 - ieeexplore.ieee.org
Existing cross-media retrieval methods are mainly based on the condition where the training
set covers all the categories in the testing set, which lack extensibility to retrieve data of new …

Zero-shot cross-modal retrieval by assembling autoencoder and generative adversarial network

X Xu, J Tian, K Lin, H Lu, J Shao, HT Shen - ACM Transactions on …, 2021 - dl.acm.org
Conventional cross-modal retrieval models mainly assume the same scope of the classes
for both the training set and the testing set. This assumption limits their extensibility on zero …

Alignment efficient image-sentence retrieval considering transferable cross-modal representation learning

Y Yang, J Guo, G Li, L Li, W Li, J Yang - Frontiers of Computer Science, 2024 - Springer
Traditional image-sentence cross-modal retrieval methods usually aim to learn consistent
representations of heterogeneous modalities, thereby to search similar instances in one …

Multimodal disentanglement variational autoencoders for zero-shot cross-modal retrieval

J Tian, K Wang, X Xu, Z Cao, F Shen… - Proceedings of the 45th …, 2022 - dl.acm.org
Zero-Shot Cross-Modal Retrieval (ZS-CMR) has recently drawn increasing attention as it
focuses on a practical retrieval scenario, i.e., the multimodal test set consists of unseen …