Dual-Modal Attention-Enhanced Text-Video Retrieval with Triplet Partial Margin Contrastive Learning

C Jiang, H Liu, X Yu, Q Wang, Y Cheng, J Xu… - Proceedings of the 31st …, 2023 - dl.acm.org
In recent years, the explosion of web videos makes text-video retrieval increasingly essential
and popular for video filtering, recommendation, and search. Text-video retrieval aims to …

Learning linguistic association towards efficient text-video retrieval

S Fang, S Wang, J Zhuo, X Han, Q Huang - European Conference on …, 2022 - Springer
Text-video retrieval attracts growing attention recently. A dominant approach is to learn a
common space for aligning two modalities. However, videos deliver richer content than text in …

Synthesizing Videos from Images for Image-to-Video Adaptation

J Zhuo, X Zhao, S Wang, H Ma, Q Huang - Proceedings of the 31st ACM …, 2023 - dl.acm.org
We address the image-to-video adaptation task that aims to leverage labeled images and
unlabeled videos for video recognition. There are two major challenges in this task …

A simple yet effective knowledge guided method for entity-aware video captioning on a basketball benchmark

Z Xi, G Shi, X Li, J Yan, Z Li, L Wu, Z Liu, L Wang - Neurocomputing, 2024 - Elsevier
Despite the recent emergence of video captioning models, how to generate the text
description with specific entity names and fine-grained actions is far from being solved …

Unsupervised Image-to-Video Adaptation via Category-aware Flow Memory Bank and Realistic Video Generation

K Huang, J Zhuo, S Wang, C Su, Q Huang… - Proceedings of the 32nd …, 2024 - dl.acm.org
Image-to-Video adaptation is proposed to train a model using labeled images and
unlabeled videos to facilitate the classification of unlabeled videos. The latest work …

Self-expressive induced clustered attention for video-text retrieval

J Zhu, X Shen, S Mehta, TA Abeo, Y Zhan - Multimedia Systems, 2024 - Springer
Extensive research has proven that self-attention achieves impressive performance in
video-text retrieval. However, most state-of-the-art methods neglect the intrinsic redundancy in …

Knowledge Graph Supported Benchmark and Video Captioning for Basketball

Z Xi, G Shi, L Wu, X Li, J Yan, L Wang, Z Liu - arXiv preprint arXiv …, 2024 - arxiv.org
Despite the recent emergence of video captioning models, how to generate the text
description with specific entity names and fine-grained actions is far from being solved …

Structured Encoding Based on Semantic Disambiguation for Video Captioning

B Sun, J Tian, Y Wu, L Yu, Y Tang - Cognitive Computation, 2024 - Springer
Video captioning, which aims to automatically generate video captions, has gained
significant attention due to its wide range of applications in video surveillance and retrieval …