Constructing phrase-level semantic labels to form multi-grained supervision for image-text retrieval
Existing research for image text retrieval mainly relies on sentence-level supervision to
distinguish matched and mismatched sentences for a query image. However, semantic …
Beyond Coarse-Grained Matching in Video-Text Retrieval
Text-to-video retrieval has seen significant advancements, yet the ability of models to
discern subtle differences in captions still requires verification. In this paper, we introduce a …
The impact of hard and easy negative training data on vulnerability prediction performance
Vulnerability prediction models have been shown to perform poorly in the real world. We
examine how the composition of negative training data influences vulnerability prediction …
A unified continuous learning framework for multi-modal knowledge discovery and pre-training
Multi-modal pre-training and knowledge discovery are two important research topics in multi-
modal machine learning. Nevertheless, none of existing works make attempts to link …
Flickr30K-CFQ: A Compact and Fragmented Query Dataset for Text-image Retrieval
With the explosive growth of multi-modal information on the Internet, unimodal search
cannot satisfy the requirement of Internet applications. Text-image retrieval research is …
AsCL: An Asymmetry-sensitive Contrastive Learning Method for Image-Text Retrieval with Cross-Modal Fusion
Z Gong, C Mai, Y Huang - arXiv preprint arXiv:2405.10029, 2024 - arxiv.org
The image-text retrieval task aims to retrieve relevant information from a given image or text.
The main challenge is to unify multimodal representation and distinguish fine-grained …
The Journal of Systems & Software
Vulnerability prediction models have been shown to perform poorly in the real world. We
examine how the composition of negative training data influences vulnerability prediction …