Visual context learning based on textual knowledge for image–text retrieval

Z Ji, Z Li, Y Zhang, H Wang, Y Pang, X Li - Neural Networks, 2024 - Elsevier

As a promising field, Multi-Query Image Retrieval (MQIR) aims at searching for the
semantically relevant image given multiple region-specific text queries. Existing works …

Cited by: 8 · Related articles · All 6 versions

Multi-modal long document classification based on Hierarchical Prompt and Multi-modal Transformer

T Liu, Y Hu, J Gao, J Wang, Y Sun, B Yin - Neural Networks, 2024 - Elsevier

In the realm of long document classification (LDC), previous research has predominantly
focused on modeling unimodal texts, overlooking the potential of multi-modal documents …

Cited by: 1 · Related articles · All 3 versions

Multi-level Symmetric Semantic Alignment Network for image-text matching

W Wang, X Di, M Liu, F Gao - Neurocomputing, 2024 - Elsevier

Image-text matching has attracted much attention as one of the visual-linguistic tasks. Most
of the existing methods tend to concentrate on single-level semantic similarity by global …

A unified multiple inducible co-attentions and edge guidance network for co-saliency detection

Z Tan, X Gu - International Conference on Artificial Neural Networks, 2022 - Springer

The learning-based methods have improved the performances of co-salient object detection
(CoSOD). Mining the intra-image saliency individuals and exploring the inter-image co …

Cited by: 1 · Related articles · All 2 versions