Proposal-free temporal moment localization of a natural-language query in video using guided...

A survey on video moment localization

M Liu, L Nie, Y Wang, M Wang, Y Rui - ACM Computing Surveys, 2023 - dl.acm.org

Video moment localization, also known as video moment retrieval, aims to search a target
segment within a video described by a given natural language query. Beyond the task of …

Cited by 30 · Related articles · All 4 versions

Temporal sentence grounding in videos: A survey and future directions

H Zhang, A Sun, W Jing, JT Zhou - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

Temporal sentence grounding in videos (TSGV), aka, natural language video localization
(NLVL) or video moment retrieval (VMR), aims to retrieve a temporal moment that …

Cited by 46 · Related articles · All 8 versions

TubeDETR: Spatio-temporal video grounding with transformers

A Yang, A Miech, J Sivic, I Laptev… - Proceedings of the …, 2022 - openaccess.thecvf.com

We consider the problem of localizing a spatio-temporal tube in a video corresponding to a
given text query. This is a challenging task that requires the joint and efficient modeling of …

Cited by 104 · Related articles · All 10 versions

Context-aware biaffine localizing network for temporal sentence grounding

D Liu, X Qu, J Dong, P Zhou, Y Cheng… - Proceedings of the …, 2021 - openaccess.thecvf.com

This paper addresses the problem of temporal sentence grounding (TSG), which aims to
identify the temporal boundary of a specific segment from an untrimmed video by a sentence …

Cited by 164 · Related articles · All 6 versions

Local-global video-text interactions for temporal grounding

J Mun, M Cho, B Han - … of the IEEE/CVF Conference on …, 2020 - openaccess.thecvf.com

This paper addresses the problem of text-to-video temporal grounding, which aims to
identify the time interval in a video semantically relevant to a text query. We tackle this …

Cited by 301 · Related articles · All 8 versions

Boundary proposal network for two-stage natural language video localization

S Xiao, L Chen, S Zhang, W Ji, J Shao, L Ye… - Proceedings of the AAAI …, 2021 - ojs.aaai.org

We aim to address the problem of Natural Language Video Localization (NLVL)—localizing
the video segment corresponding to a natural language description in a long and untrimmed …

Cited by 176 · Related articles · All 5 versions

Negative sample matters: A renaissance of metric learning for temporal grounding

Z Wang, L Wang, T Wu, T Li, G Wu - … of the AAAI Conference on Artificial …, 2022 - ojs.aaai.org

Temporal grounding aims to localize a video moment which is semantically aligned with a
given natural language query. Existing methods typically apply a detection or regression …

Cited by 125 · Related articles · All 5 versions

Mindstorms in natural language-based societies of mind

M Zhuge, H Liu, F Faccio, DR Ashley… - arXiv preprint arXiv …, 2023 - arxiv.org

Both Minsky's "society of mind" and Schmidhuber's "learning to think" inspire diverse
societies of large multimodal neural networks (NNs) that solve problems by interviewing …

Cited by 61 · Related articles · All 7 versions

Fast video moment retrieval

J Gao, C Xu - Proceedings of the IEEE/CVF International …, 2021 - openaccess.thecvf.com

This paper targets at fast video moment retrieval (fast VMR), aiming to localize the target
moment efficiently and accurately as queried by a given natural language sentence. We …

Cited by 111 · Related articles · All 4 versions

Semantic conditioned dynamic modulation for temporal sentence grounding in videos

Y Yuan, L Ma, J Wang, W Liu… - Advances in Neural …, 2019 - proceedings.neurips.cc

Temporal sentence grounding in videos aims to detect and localize one target video
segment, which semantically corresponds to a given sentence. Existing methods mainly …

Cited by 269 · Related articles · All 12 versions