[HTML][HTML] Multimodal Contrastive Learning for Remote Sensing Image Feature Extraction Based on Relaxed Positive Samples
Z Zhang, Q Li, W Jing, G He, L Zhu, S Gao - Sensors, 2024 - mdpi.com
Traditional multimodal contrastive learning brings text and its corresponding image closer
together as a positive pair, where the text typically consists of fixed sentence structures or …
together as a positive pair, where the text typically consists of fixed sentence structures or …
Mind the modality gap: Towards a remote sensing vision-language model via cross-modal alignment
Deep Learning (DL) is undergoing a paradigm shift with the emergence of foundation
models, aptly named by their crucial, yet incomplete nature. In this work, we focus on …
models, aptly named by their crucial, yet incomplete nature. In this work, we focus on …
GraphVL: graph-enhanced semantic modeling via vision-language models for generalized class discovery
B Solanki, AR Nair, M Singha… - Proceedings of the …, 2024 - dl.acm.org
Generalized Category Discovery (GCD) aims to cluster unlabeled images into known and
novel categories using labeled images from known classes. To address the challenge of …
novel categories using labeled images from known classes. To address the challenge of …
RS3Lip: Consistency for remote sensing image classification on part embeddings using self-supervised learning and CLIP
Tackling domain and class generalization challenges remains a significant hurdle in the
realm of remote sensing (RS). Recently, large-scale pre-trained vision-language models …
realm of remote sensing (RS). Recently, large-scale pre-trained vision-language models …
Diffusion in Zero-Shot Learning for Environmental Audio
Zero-shot learning enables models to generalize to unseen classes by leveraging semantic
information, bridging the gap between training and testing sets with non-overlapping …
information, bridging the gap between training and testing sets with non-overlapping …