From Google Gemini to OpenAI Q* (Q-Star): A survey of reshaping the generative artificial intelligence (AI) research landscape
This comprehensive survey explored the evolving landscape of generative Artificial
Intelligence (AI), with a specific focus on the transformative impacts of Mixture of Experts …
Comparative analysis on cross-modal information retrieval: A review
Human beings experience life through a spectrum of modes such as vision, taste, hearing,
smell, and touch. These multiple modes are integrated for information processing in our …
Unsupervised contrastive cross-modal hashing
In this paper, we study how to make unsupervised cross-modal hashing (CMH) benefit from
contrastive learning (CL) by overcoming two challenges. To be exact, i) to address the …
Dynamic modality interaction modeling for image-text retrieval
Image-text retrieval is a fundamental and crucial branch in information retrieval. Although
much progress has been made in bridging vision and language, it remains challenging …
Remote sensing cross-modal text-image retrieval based on global and local information
Cross-modal remote sensing text-image retrieval (RSCTIR) has recently become an urgent
research hotspot due to its ability of enabling fast and flexible information extraction on …
Aggregation-based graph convolutional hashing for unsupervised cross-modal retrieval
Cross-modal hashing has sparked much attention in large-scale information retrieval for its
storage and query efficiency. Despite the great success achieved by supervised …
Learning cross-modal retrieval with noisy labels
Recently, cross-modal retrieval is emerging with the help of deep multimodal learning.
However, even for unimodal data, collecting large-scale well-annotated data is expensive …
Robust multi-view clustering with noisy correspondence
Deep multi-view clustering leverages deep neural networks to achieve promising
performance, but almost all existing methods implicitly assume that all views are aligned …
Deep multimodal transfer learning for cross-modal retrieval
Cross-modal retrieval (CMR) enables flexible retrieval experience across different
modalities (e.g., texts versus images), which maximally benefits us from the abundance of …
Multi-modality associative bridging through memory: Speech sound recollected from face video
In this paper, we introduce a novel audio-visual multi-modal bridging framework that can
utilize both audio and visual information, even with uni-modal inputs. We exploit a memory …