Negative Pre-aware for Noisy Cross-Modal Matching

X Zhang, H Li, M Ye - Proceedings of the AAAI Conference on Artificial …, 2024 - ojs.aaai.org
Cross-modal noise-robust learning is a challenging task since noisy correspondence is hard
to recognize and rectify. Due to the cumulative and unavoidable negative impact of …

A Comprehensive Survey on Evidential Deep Learning and Its Applications

J Gao, M Chen, L Xiang, C Xu - arXiv preprint arXiv:2409.04720, 2024 - arxiv.org
Reliable uncertainty estimation has become a crucial requirement for the industrial
deployment of deep learning algorithms, particularly in high-risk applications such as …

MS-Former: Memory-Supported Transformer for Weakly Supervised Change Detection with Patch-Level Annotations

Z Li, C Tang, X Liu, C Li, X Li… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Fully supervised change detection (CD) methods have achieved significant advancements
in performance, yet they depend severely on acquiring costly pixel-level labels. Considering …

Multimodal LLM Enhanced Cross-lingual Cross-modal Retrieval

Y Wang, L Wang, Q Zhou, Z Wang, H Li, G Hua… - Proceedings of the …, 2024 - dl.acm.org
Cross-lingual cross-modal retrieval (CCR) aims to retrieve visually relevant content based
on non-English queries, without relying on human-labeled cross-modal data pairs during …

Revisiting Essential and Nonessential Settings of Evidential Deep Learning

M Chen, J Gao, C Xu - arXiv preprint arXiv:2410.00393, 2024 - arxiv.org
Evidential Deep Learning (EDL) is an emerging method for uncertainty estimation that
provides reliable predictive uncertainty in a single forward pass, attracting significant …

Overcoming the Pitfalls of Vision-Language Model for Image-Text Retrieval

F Zhang, S Qu, F Shi, C Xu - Proceedings of the 32nd ACM International …, 2024 - dl.acm.org
This work tackles the persistent challenge of image-text retrieval, a key problem at the
intersection of computer vision and natural language processing. Despite significant …

One-step Noisy Label Mitigation

H Li, J Gu, J Song, A Zhang, L Gao - arXiv preprint arXiv:2410.01944, 2024 - arxiv.org
Mitigating the detrimental effects of noisy labels on the training process has become
increasingly critical, as obtaining entirely clean or human-annotated samples for large-scale …

Revamping Image-Recipe Cross-Modal Retrieval with Dual Cross Attention Encoders

W Liu, S Yuan, Z Wang, X Chang, L Gao, Z Zhang - Mathematics, 2024 - mdpi.com
The image-recipe cross-modal retrieval task, which retrieves the relevant recipes according
to food images and vice versa, is now attracting widespread attention. There are two main …

Fine-grained Feature Assisted Cross-modal Image-text Retrieval

C Bu, X Liu, Z Huang, Y Su, J Tu, R Hong - Chinese Conference on …, 2024 - Springer
Cross-modal image-text retrieval is a challenging task due to the inherent ambiguity
between modalities. However, most existing methods formulate this problem either with the …