Prompting visual-language models for efficient video understanding

C Ju, T Han, K Zheng, Y Zhang, W Xie - European Conference on …, 2022 - Springer
Image-based visual-language (I-VL) pre-training has shown great success for learning joint
visual-textual representations from large-scale web data, revealing remarkable ability for …

Overview of temporal action detection based on deep learning

K Hu, C Shen, T Wang, K Xu, Q Xia, M Xia… - Artificial Intelligence …, 2024 - Springer
Temporal Action Detection (TAD) aims to accurately capture each action interval in
an untrimmed video and to understand human actions. This paper comprehensively surveys …

Self-feedback DETR for temporal action detection

J Kim, M Lee, JP Heo - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
Temporal Action Detection (TAD) is challenging but fundamental for real-world
video applications. Recently, DETR-based models have been devised for TAD but have not …

DiffTAD: Temporal action detection with proposal denoising diffusion

S Nag, X Zhu, J Deng, YZ Song… - Proceedings of the …, 2023 - openaccess.thecvf.com
We propose a new formulation of temporal action detection (TAD) with denoising diffusion,
DiffTAD in short. Taking as input random temporal proposals, it can yield action proposals …

Decomposed cross-modal distillation for RGB-based temporal action detection

P Lee, T Kim, M Shim, D Wee… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Temporal action detection aims to predict the time intervals and the classes of action
instances in the video. Despite the promising performance, existing two-stream models …

Geostatistical modeling approach for studying total soil nitrogen and phosphorus under various land uses of North-Western Himalayas

O Bashir, SA Bangroo, SS Shafai, N Senesi… - Ecological …, 2024 - Elsevier
The distribution of total soil nitrogen (TSN) and total soil phosphorus (TSP) plays a pivotal
role in shaping soil quality, fertility, agricultural practices, and environmental balance …

Distilling vision-language pre-training to collaborate with weakly-supervised temporal action localization

C Ju, K Zheng, J Liu, P Zhao, Y Zhang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Weakly-supervised temporal action localization (WTAL) learns to detect and classify action
instances with only category labels. Most methods widely adopt the off-the-shelf …

Hierarchical local-global transformer for temporal sentence grounding

X Fang, D Liu, P Zhou, Z Xu, R Li - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
This article studies the multimedia problem of temporal sentence grounding (TSG), which
aims to accurately determine the specific video segment in an untrimmed video according to …

Action sensitivity learning for temporal action localization

J Shao, X Wang, R Quan, J Zheng… - Proceedings of the …, 2023 - openaccess.thecvf.com
Temporal action localization (TAL), which involves recognizing and locating action
instances, is a challenging task in video understanding. Most existing approaches directly …

Learning from noisy pseudo labels for semi-supervised temporal action localization

K Xia, L Wang, S Zhou, G Hua… - Proceedings of the …, 2023 - openaccess.thecvf.com
Semi-Supervised Temporal Action Localization (SS-TAL) aims to improve the
generalization ability of action detectors with large-scale unlabeled videos. Albeit the recent …