Fine-grained image-text matching by cross-modal hard aligning network

Z Pan, F Wu, B Zhang - … of the IEEE/CVF conference on …, 2023 - openaccess.thecvf.com
Current state-of-the-art image-text matching methods implicitly align the visual-semantic
fragments, like regions in images and words in sentences, and adopt cross-attention …

Learning semantic relationship among instances for image-text matching

Z Fu, Z Mao, Y Song, Y Zhang - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Image-text matching, a bridge connecting image and language, is an important task, which
generally learns a holistic cross-modal embedding to achieve a high-quality semantic …

Cross-modal active complementary learning with self-refining correspondence

Y Qin, Y Sun, D Peng, JT Zhou… - Advances in Neural …, 2023 - proceedings.neurips.cc
Recently, image-text matching has attracted more and more attention from academia and
industry, which is fundamental to understanding the latent correspondence across visual …

Cross-modal semantic enhanced interaction for image-sentence retrieval

X Ge, F Chen, S Xu, F Tao… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Image-sentence retrieval has attracted extensive research attention in multimedia and
computer vision due to its promising application. The key issue lies in jointly learning the …

Cross-Modal Retrieval: A Review of Methodologies, Datasets, and Future Perspectives

Z Han, A Azman, MR Mustaffa, FB Khalid - IEEE Access, 2024 - ieeexplore.ieee.org
With the rapid development of science and technology, all types of mixed media contain
large amounts of data. Traditional single multimedia data can no longer satisfy daily …

MKVSE: Multimodal knowledge enhanced visual-semantic embedding for image-text retrieval

D Feng, X He, Y Peng - ACM Transactions on Multimedia Computing …, 2023 - dl.acm.org
Image-text retrieval aims to take the text (image) query to retrieve the semantically relevant
images (texts), which is fundamental and critical in the search system, online shopping, and …

Efficient token-guided image-text retrieval with consistent multimodal contrastive training

C Liu, Y Zhang, H Wang, W Chen… - … on Image Processing, 2023 - ieeexplore.ieee.org
Image-text retrieval is a central problem for understanding the semantic relationship
between vision and language, and serves as the basis for various visual and language …

Neuron-based spiking transmission and reasoning network for robust image-text retrieval

W Li, Z Ma, LJ Deng, X Fan… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Most of the image-text retrieval methods carry out accurate results using fine-grained
features for feature alignment. However, extracting the robustness features while …

Breaking Through the Noisy Correspondence: A Robust Model for Image-Text Matching

H Shi, M Liu, X Mu, X Song, Y Hu, L Nie - ACM Transactions on …, 2024 - dl.acm.org
Unleashing the power of image-text matching in real-world applications is hampered by
noisy correspondence. Manually curating high-quality datasets is expensive and time …

AMC: Adaptive multi-expert collaborative network for text-guided image retrieval

H Zhu, Y Wei, Y Zhao, C Zhang, S Huang - ACM Transactions on …, 2023 - dl.acm.org
Text-guided image retrieval integrates reference image and text feedback as a multimodal
query to search the image corresponding to user intention. Recent approaches employ multi …