Scalable deep multimodal learning for cross-modal retrieval

From Google Gemini to OpenAI Q* (Q-Star): A survey of reshaping the generative Artificial Intelligence (AI) research landscape

TR McIntosh, T Susnjak, T Liu, P Watters… - arXiv preprint arXiv …, 2023 - arxiv.org

This comprehensive survey explored the evolving landscape of generative Artificial
Intelligence (AI), with a specific focus on the transformative impacts of Mixture of Experts …

Cited by 113 | Related articles | All 3 versions

Comparative analysis on cross-modal information retrieval: A review

P Kaur, HS Pannu, AK Malhi - Computer Science Review, 2021 - Elsevier

Human beings experience life through a spectrum of modes such as vision, taste, hearing,
smell, and touch. These multiple modes are integrated for information processing in our …

Cited by 107 | Related articles | All 3 versions

Unsupervised contrastive cross-modal hashing

P Hu, H Zhu, J Lin, D Peng, YP Zhao… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org

In this paper, we study how to make unsupervised cross-modal hashing (CMH) benefit from
contrastive learning (CL) by overcoming two challenges. To be exact, i) to address the …

Cited by 120 | Related articles | All 4 versions

Dynamic modality interaction modeling for image-text retrieval

L Qu, M Liu, J Wu, Z Gao, L Nie - … of the 44th International ACM SIGIR …, 2021 - dl.acm.org

Image-text retrieval is a fundamental and crucial branch in information retrieval. Although
much progress has been made in bridging vision and language, it remains challenging …

Cited by 153 | Related articles | All 3 versions

Remote sensing cross-modal text-image retrieval based on global and local information

Z Yuan, W Zhang, C Tian, X Rong… - … on Geoscience and …, 2022 - ieeexplore.ieee.org

Cross-modal remote sensing text-image retrieval (RSCTIR) has recently become an urgent
research hotspot due to its ability of enabling fast and flexible information extraction on …

Cited by 116 | Related articles | All 3 versions

Aggregation-based graph convolutional hashing for unsupervised cross-modal retrieval

PF Zhang, Y Li, Z Huang, XS Xu - IEEE Transactions on …, 2021 - ieeexplore.ieee.org

Cross-modal hashing has sparked much attention in large-scale information retrieval for its
storage and query efficiency. Despite the great success achieved by supervised …

Cited by 132 | Related articles | All 4 versions

Learning cross-modal retrieval with noisy labels

P Hu, X Peng, H Zhu, L Zhen… - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com

Recently, cross-modal retrieval is emerging with the help of deep multimodal learning.
However, even for unimodal data, collecting large-scale well-annotated data is expensive …

Cited by 103 | Related articles | All 7 versions

Robust multi-view clustering with noisy correspondence

Y Sun, Y Qin, Y Li, D Peng, X Peng… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org

Deep multi-view clustering leverages deep neural networks to achieve promising
performance, but almost all existing methods implicitly assume that all views are aligned …

Cited by 14 | Related articles | All 3 versions

Deep multimodal transfer learning for cross-modal retrieval

L Zhen, P Hu, X Peng, RSM Goh… - IEEE Transactions on …, 2020 - ieeexplore.ieee.org

Cross-modal retrieval (CMR) enables flexible retrieval experience across different
modalities (e.g., texts versus images), which maximally benefits us from the abundance of …

Cited by 78 | Related articles | All 5 versions

Multi-modality associative bridging through memory: Speech sound recollected from face video

M Kim, J Hong, SJ Park, YM Ro - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com

In this paper, we introduce a novel audio-visual multi-modal bridging framework that can
utilize both audio and visual information, even with uni-modal inputs. We exploit a memory …

Cited by 47 | Related articles | All 8 versions