Temporal sentence grounding in videos: A survey and future directions
Temporal sentence grounding in videos (TSGV), also known as natural language video localization
(NLVL) or video moment retrieval (VMR), aims to retrieve a temporal moment that …
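As a rough illustration of the task these surveys describe, the following is a minimal sketch of the TSGV/VMR interface: given per-clip video features and a sentence-query embedding, score candidate temporal segments and return the best-matching (start, end) pair. The function name, the cosine-similarity scoring, and the feature shapes are illustrative assumptions, not any surveyed paper's method.

```python
# Illustrative sketch of the TSGV/VMR task interface (assumed, not from any
# specific paper): pick the segment whose pooled feature best matches the query.
import numpy as np

def ground_sentence(clip_feats: np.ndarray,
                    query_feat: np.ndarray,
                    max_len: int = 8) -> tuple[int, int]:
    """Return (start_clip, end_clip) of the candidate segment whose mean
    clip feature has the highest cosine similarity with the query embedding."""
    n = clip_feats.shape[0]
    best, best_score = (0, 0), -np.inf
    for s in range(n):
        for e in range(s, min(n, s + max_len)):       # segments up to max_len clips
            seg = clip_feats[s:e + 1].mean(axis=0)     # pooled segment feature
            score = seg @ query_feat / (
                np.linalg.norm(seg) * np.linalg.norm(query_feat) + 1e-8)
            if score > best_score:
                best, best_score = (s, e), score
    return best

# Toy usage with random stand-in features: 20 clips, 512-d features and query.
clips = np.random.randn(20, 512)
query = np.random.randn(512)
print(ground_sentence(clips, query))  # e.g. (3, 10)
```

In practice the clip and query features would come from pretrained video and language encoders rather than random arrays; the sketch only fixes the input/output shape of the grounding problem.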
A survey on temporal sentence grounding in videos
Temporal sentence grounding in videos (TSGV), which aims at localizing one target
segment from an untrimmed video with respect to a given sentence query, has drawn …
Not all inputs are valid: Towards open-set video moment retrieval using language
Video Moment Retrieval (VMR) aims to retrieve the specific moment corresponding to a
sentence query from an untrimmed video. Although recent respectable works have made …
Prompt-based zero-shot video moment retrieval
Video moment retrieval aims at localizing a specific moment from an untrimmed video by a
sentence query. Most methods rely on heavy annotations of video moment-query pairs …
Sdn: Semantic decoupling network for temporal language grounding
Temporal language grounding (TLG) is one of the most challenging cross-modal video
understanding tasks; it aims at retrieving the most relevant video segment from an …
The elements of temporal sentence grounding in videos: A survey and future directions
Temporal sentence grounding in videos (TSGV), also known as natural language video localization
(NLVL) or video moment retrieval (VMR), aims to retrieve a temporal moment that …
Mrtnet: Multi-resolution temporal network for video sentence grounding
Video sentence grounding locates a specific moment in a video based on a text query.
Existing methods focus on a single temporal resolution, ignoring multi-scale temporal …
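The multi-scale point raised in the MRTNet snippet can be illustrated with a simple temporal feature pyramid; the average-pooling scheme below is an assumed stand-in for illustration only, not the paper's actual multi-resolution network.

```python
# Illustrative sketch of multi-resolution temporal features (assumed scheme):
# average-pool clip features at several strides to get coarser time scales.
import numpy as np

def temporal_pyramid(clip_feats: np.ndarray, strides=(1, 2, 4)) -> list[np.ndarray]:
    """Return one feature sequence per temporal resolution,
    obtained by mean-pooling consecutive clips at each stride."""
    levels = []
    for s in strides:
        n = clip_feats.shape[0] // s
        pooled = clip_feats[:n * s].reshape(n, s, -1).mean(axis=1)
        levels.append(pooled)  # shape: (T // s, D)
    return levels

feats = np.random.randn(16, 512)
for lvl in temporal_pyramid(feats):
    print(lvl.shape)  # (16, 512), (8, 512), (4, 512)
```

A grounding model could then score query-segment matches at each level, which is the general motivation behind multi-scale temporal modeling; how MRTNet itself fuses the resolutions is described in the paper.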
Video moment retrieval with hierarchical contrastive learning
This paper explores the task of video moment retrieval (VMR), which aims to localize the
temporal boundary of a specific moment from an untrimmed video by a sentence query …
DPHANet: Discriminative Parallel and Hierarchical Attention Network for Natural Language Video Localization
R Chen, J Tan, Z Yang, X Yang, Q Dai… - IEEE Transactions …, 2024 - ieeexplore.ieee.org
Natural Language Video Localization (NLVL) has recently attracted much attention because
of its practical significance. However, existing methods still face the following challenges …
Unsupervised Video Moment Retrieval with Knowledge-Based Pseudo-Supervision Construction
Video moment retrieval locates the moment specified by a sentence query. Recent
approaches have made remarkable advancements with large-scale video-sentence …