Deep learning-based action detection in untrimmed videos: A survey

E Vahdani, Y Tian - IEEE Transactions on Pattern Analysis and …, 2022 - ieeexplore.ieee.org
Understanding human behavior and activity facilitates advancement of numerous real-world
applications, and is critical for video analysis. Despite the progress of action recognition …

CoLA: Weakly-supervised temporal action localization with snippet contrastive learning

C Zhang, M Cao, D Yang, J Chen… - Proceedings of the …, 2021 - openaccess.thecvf.com
Weakly-supervised temporal action localization (WS-TAL) aims to localize actions in
untrimmed videos with only video-level labels. Most existing models follow the "localization …

Object-region video transformers

R Herzig, E Ben-Avraham… - Proceedings of the …, 2022 - openaccess.thecvf.com
Recently, video transformers have shown great success in video understanding, exceeding
CNN performance; yet existing video transformer models do not explicitly model objects …

TSP: Temporally-sensitive pretraining of video encoders for localization tasks

H Alwassel, S Giancola… - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com
Due to the large memory footprint of untrimmed videos, current state-of-the-art video
localization methods operate atop precomputed video clip features. These features are …

Learning action completeness from points for weakly-supervised temporal action localization

P Lee, H Byun - Proceedings of the IEEE/CVF international …, 2021 - openaccess.thecvf.com
We tackle the problem of localizing temporal intervals of actions with only a single frame
label for each action instance for training. Owing to label sparsity, existing work fails to learn …

Cross-modal consensus network for weakly supervised temporal action localization

FT Hong, JC Feng, D Xu, Y Shan… - Proceedings of the 29th …, 2021 - dl.acm.org
Weakly supervised temporal action localization (WS-TAL) is a challenging task that aims to
localize action instances in the given video with video-level categorical supervision …

Background-click supervision for temporal action localization

L Yang, J Han, T Zhao, T Lin, D Zhang… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
Weakly supervised temporal action localization aims at learning the instance-level action
pattern from the video-level labels, where a significant challenge is action-context confusion …

Learning to refactor action and co-occurrence features for temporal action localization

K Xia, L Wang, S Zhou, N Zheng… - Proceedings of the …, 2022 - openaccess.thecvf.com
The main challenge of Temporal Action Localization is to retrieve subtle human actions from
various co-occurring ingredients, e.g., context and background, in an untrimmed video. While …

Weakly-supervised temporal action localization by uncertainty modeling

P Lee, J Wang, Y Lu, H Byun - Proceedings of the AAAI conference on …, 2021 - ojs.aaai.org
Weakly-supervised temporal action localization aims to learn to detect temporal intervals of
action classes with only video-level labels. To this end, it is crucial to separate frames of …

Activity graph transformer for temporal action localization

M Nawhal, G Mori - arXiv preprint arXiv:2101.08540, 2021 - arxiv.org
We introduce Activity Graph Transformer, an end-to-end learnable model for temporal action
localization, that receives a video as input and directly predicts a set of action instances that …