Learning semantic relationship among instances for image-text matching
Image-text matching, a bridge connecting image and language, is an important task, which
generally learns a holistic cross-modal embedding to achieve a high-quality semantic …
Reservoir computing transformer for image-text retrieval
Although the attention mechanism in transformers has proven successful in image-text
retrieval tasks, most transformer models suffer from a large number of parameters. Inspired …
SEMScene: Semantic-consistency enhanced multi-level scene graph matching for image-text retrieval
Image-text retrieval, a fundamental cross-modal task, performs similarity reasoning for
images and texts. The primary challenge for image-text retrieval is cross-modal semantic …
Metasql: A generate-then-rank framework for natural language to sql translation
The Natural Language Interface to Databases (NLIDB) empowers non-technical users with
database access through intuitive natural language (NL) interactions. Advanced …
Cross-modal independent matching network for image-text retrieval
X Ke, B Chen, X Yang, Y Cai, H Liu, W Guo - Pattern Recognition, 2025 - Elsevier
Image-text retrieval serves as a bridge connecting vision and language. Mainstream modal
cross matching methods can effectively perform cross-modal interactions with high …
DCL-net: Dual-level correlation learning network for image–text retrieval
Z Liu, A Li, J Xu, D Shi - Computers and Electrical Engineering, 2025 - Elsevier
Due to the inconsistency in feature representations between different modalities, known as
the “Heterogeneous gap”, image–text retrieval (ITR) is a challenging task. To bridge this …
Negative sample is negative in its own way: Tailoring negative sentences for image-text retrieval
Matching model is essential for Image-Text Retrieval framework. Existing research usually
train the model with a triplet loss and explore various strategy to retrieve hard negative …
A unified continuous learning framework for multi-modal knowledge discovery and pre-training
Multi-modal pre-training and knowledge discovery are two important research topics in multi-
modal machine learning. Nevertheless, none of existing works make attempts to link …
Joint Intra & Inter-Grained Reasoning: A New Look Into Semantic Consistency of Image-Text Retrieval
Multimodal understanding aims at constructing semantic correlations among modalities of
data while performing various downstream tasks. As one of the primary multimodal …
Improving Image-Text Matching by Integrating Word Sense Disambiguation
This letter presents a novel approach to enhance image-text matching by incorporating word
sense disambiguation (WSD) within the text encoder. Our method explicitly models the …