TURN TAP: Temporal unit regression network for temporal action proposals

E Vahdani, Y Tian - IEEE Transactions on Pattern Analysis and …, 2022 - ieeexplore.ieee.org

Understanding human behavior and activity facilitates advancement of numerous real-world
applications, and is critical for video analysis. Despite the progress of action recognition …

Cited by 59 · Related articles · All 8 versions

Vid2Seq: Large-scale pretraining of a visual language model for dense video captioning

A Yang, A Nagrani, PH Seo, A Miech… - Proceedings of the …, 2023 - openaccess.thecvf.com

In this work, we introduce Vid2Seq, a multi-modal single-stage dense event captioning
model pretrained on narrated videos which are readily-available at scale. The Vid2Seq …

Cited by 151 · Related articles · All 26 versions

TN-ZSTAD: Transferable network for zero-shot temporal activity detection

L Zhang, X Chang, J Liu, M Luo, Z Li… - … on Pattern Analysis …, 2022 - ieeexplore.ieee.org

An integral part of video analysis and surveillance is temporal activity detection, which
means to simultaneously recognize and localize activities in long untrimmed videos …

Cited by 105 · Related articles · All 6 versions

Learning salient boundary feature for anchor-free temporal action localization

C Lin, C Xu, D Luo, Y Wang, Y Tai… - Proceedings of the …, 2021 - openaccess.thecvf.com

Temporal action localization is an important yet challenging task in video understanding.
Typically, such a task aims at inferring both the action category and localization of the start …

Cited by 266 · Related articles · All 5 versions

End-to-end dense video captioning with parallel decoding

T Wang, R Zhang, Z Lu, F Zheng… - Proceedings of the …, 2021 - openaccess.thecvf.com

Dense video captioning aims to generate multiple associated captions with their temporal
locations from the video. Previous methods follow a sophisticated "localize-then-describe" …

Cited by 170 · Related articles · All 6 versions

End-to-end temporal action detection with transformer

X Liu, Q Wang, Y Hu, X Tang, S Zhang… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org

Temporal action detection (TAD) aims to determine the semantic label and the temporal
interval of every action instance in an untrimmed video. It is a fundamental and challenging …

Cited by 198 · Related articles · All 5 versions

BMN: Boundary-matching network for temporal action proposal generation

T Lin, X Liu, X Li, E Ding, S Wen - Proceedings of the IEEE …, 2019 - openaccess.thecvf.com

Temporal action proposal generation is a challenging and promising task which aims to
locate temporal regions in real-world videos where action or event may occur. Current …

Cited by 702 · Related articles · All 5 versions

G-TAD: Sub-graph localization for temporal action detection

M Xu, C Zhao, DS Rojas, A Thabet… - Proceedings of the …, 2020 - openaccess.thecvf.com

Temporal action detection is a fundamental yet challenging task in video understanding.
Video context is a critical cue to effectively detect actions, but current works mainly focus on …

Cited by 517 · Related articles · All 11 versions

Graph convolutional networks for temporal action localization

R Zeng, W Huang, M Tan, Y Rong… - Proceedings of the …, 2019 - openaccess.thecvf.com

Most state-of-the-art action localization systems process each action proposal individually,
without explicitly exploiting their relations during learning. However, the relations between …

Cited by 569 · Related articles · All 8 versions

Relaxed transformer decoders for direct action proposal generation

J Tan, J Tang, L Wang, G Wu - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com

Temporal action proposal generation is an important and challenging task in video
understanding, which aims at detecting all temporal segments containing action instances of …

Cited by 191 · Related articles · All 6 versions