LexLIP: Lexicon-bottlenecked language-image pre-training for large-scale image-text sparse retrieval

Z Luo, P Zhao, C Xu, X Geng, T Shen… - Proceedings of the …, 2023 - openaccess.thecvf.com
Image-text retrieval (ITR) aims to retrieve images or texts that match a query originating from
the other modality. The conventional dense retrieval paradigm relies on encoding images …

CROMA: Remote sensing representations with contrastive radar-optical masked autoencoders

A Fuller, K Millard, J Green - Advances in Neural …, 2024 - proceedings.neurips.cc
A vital and rapidly growing application, remote sensing offers vast yet sparsely labeled,
spatially aligned multimodal data; this makes self-supervised learning algorithms invaluable …

Balance act: Mitigating hubness in cross-modal retrieval with query and gallery banks

Y Wang, X Jian, B Xue - arXiv preprint arXiv:2310.11612, 2023 - arxiv.org
In this work, we present a post-processing solution to address the hubness problem in
cross-modal retrieval, a phenomenon where a small number of gallery data points are frequently …

LexLIP: lexicon-bottlenecked language-image pre-training for large-scale image-text retrieval

P Zhao, C Xu, X Geng, T Shen, C Tao, J Ma… - arXiv preprint arXiv …, 2023 - arxiv.org
Image-text retrieval (ITR) is a task to retrieve the relevant images/texts, given the query from
another modality. The conventional dense retrieval paradigm relies on encoding images …

CSDNet: Contrastive Similarity Distillation Network for Multi-lingual Image-Text Retrieval

S Lu, L Guo, X He, X Zhu, J Liu, S Liu - International Conference on Image …, 2023 - Springer
Cross-modal image-text retrieval is a crucial task in the field of vision and language, aimed
at retrieving the relevant samples from one modality as per the given user expressed in …

Self-supervised Pretraining of Vision Transformers for Earth Observation

A Fuller - 2023 - repository.library.carleton.ca
Remote sensing offers vast yet sparsely labeled multimodal data but lacks foundation
models that can be leveraged across societally impactful applications. In this thesis, I …