Temporal sentence grounding in videos: A survey and future directions

H Zhang, A Sun, W Jing, JT Zhou - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Temporal sentence grounding in videos (TSGV), aka, natural language video localization
(NLVL) or video moment retrieval (VMR), aims to retrieve a temporal moment that …

A survey on temporal sentence grounding in videos

X Lan, Y Yuan, X Wang, Z Wang, W Zhu - ACM Transactions on …, 2023 - dl.acm.org
Temporal sentence grounding in videos (TSGV), which aims at localizing one target
segment from an untrimmed video with respect to a given sentence query, has drawn …

Not all inputs are valid: Towards open-set video moment retrieval using language

X Fang, W Fang, D Liu, X Qu, J Dong, P Zhou… - Proceedings of the …, 2024 - dl.acm.org
Video Moment Retrieval (VMR) targets to retrieve the specific moment corresponding to a
sentence query from an untrimmed video. Although recent respectable works have made …

Prompt-based zero-shot video moment retrieval

G Wang, X Wu, Z Liu, J Yan - Proceedings of the 30th ACM International …, 2022 - dl.acm.org
Video moment retrieval aims at localizing a specific moment from an untrimmed video by a
sentence query. Most methods rely on heavy annotations of video moment-query pairs …

SDN: Semantic decoupling network for temporal language grounding

X Jiang, X Xu, J Zhang, F Shen, Z Cao… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Temporal language grounding (TLG) is one of the most challenging cross-modal video
understanding tasks, which aims at retrieving the most relevant video segment from an …

The elements of temporal sentence grounding in videos: A survey and future directions

H Zhang, A Sun, W Jing, JT Zhou - arXiv preprint arXiv …, 2022 - researchgate.net
Temporal sentence grounding in videos (TSGV), aka, natural language video localization
(NLVL) or video moment retrieval (VMR), aims to retrieve a temporal moment that …

MRTNet: Multi-resolution temporal network for video sentence grounding

W Ji, Y Qin, L Chen, Y Wei, Y Wu… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org
Video sentence grounding locates a specific moment in a video based on a text query.
Existing methods focus on single temporal resolution, ignoring multi-scale temporal …

Video moment retrieval with hierarchical contrastive learning

B Zhang, C Yang, B Jiang, X Zhou - Proceedings of the 30th ACM …, 2022 - dl.acm.org
This paper explores the task of video moment retrieval (VMR), which aims to localize the
temporal boundary of a specific moment from an untrimmed video by a sentence query …

DPHANet: Discriminative Parallel and Hierarchical Attention Network for Natural Language Video Localization

R Chen, J Tan, Z Yang, X Yang, Q Dai… - IEEE Transactions …, 2024 - ieeexplore.ieee.org
Natural Language Video Localization (NLVL) has recently attracted much attention because
of its practical significance. However, the existing methods still face the following challenges …

Unsupervised Video Moment Retrieval with Knowledge-Based Pseudo-Supervision Construction

G Wang, X Wu, X Tu, Z Liu, J Yan - ACM Transactions on Information …, 2024 - dl.acm.org
Video moment retrieval locates a specified moment by a sentence query. Recent
approaches have made remarkable advancements with large-scale video-sentence …