Negative Pre-aware for Noisy Cross-Modal Matching
Cross-modal noise-robust learning is a challenging task since noisy correspondence is hard
to recognize and rectify. Due to the cumulative and unavoidable negative impact of …
A Comprehensive Survey on Evidential Deep Learning and Its Applications
Reliable uncertainty estimation has become a crucial requirement for the industrial
deployment of deep learning algorithms, particularly in high-risk applications such as …
MS-Former: Memory-Supported Transformer for Weakly Supervised Change Detection with Patch-Level Annotations
Fully supervised change detection (CD) methods have achieved significant advancements
in performance, yet they depend severely on acquiring costly pixel-level labels. Considering …
Multimodal LLM Enhanced Cross-lingual Cross-modal Retrieval
Cross-lingual cross-modal retrieval (CCR) aims to retrieve visually relevant content based
on non-English queries, without relying on human-labeled cross-modal data pairs during …
Revisiting Essential and Nonessential Settings of Evidential Deep Learning
M Chen, J Gao, C Xu - arXiv preprint arXiv:2410.00393, 2024 - arxiv.org
Evidential Deep Learning (EDL) is an emerging method for uncertainty estimation that
provides reliable predictive uncertainty in a single forward pass, attracting significant …
Overcoming the Pitfalls of Vision-Language Model for Image-Text Retrieval
This work tackles the persistent challenge of image-text retrieval, a key problem at the
intersection of computer vision and natural language processing. Despite significant …
One-step Noisy Label Mitigation
Mitigating the detrimental effects of noisy labels on the training process has become
increasingly critical, as obtaining entirely clean or human-annotated samples for large-scale …
Revamping Image-Recipe Cross-Modal Retrieval with Dual Cross Attention Encoders
W Liu, S Yuan, Z Wang, X Chang, L Gao, Z Zhang - Mathematics, 2024 - mdpi.com
The image-recipe cross-modal retrieval task, which retrieves the relevant recipes according
to food images and vice versa, is now attracting widespread attention. There are two main …
Fine-grained Feature Assisted Cross-modal Image-text Retrieval
Cross-modal image-text retrieval is a challenging task due to the inherent ambiguity
between modalities. However, most existing methods formulate this problem either with the …