Point-level temporal action localization: Bridging fully-supervised proposals to weakly-supervise...

E Vahdani, Y Tian - IEEE Transactions on Pattern Analysis and …, 2022 - ieeexplore.ieee.org

Understanding human behavior and activity facilitates advancement of numerous real-world
applications, and is critical for video analysis. Despite the progress of action recognition …

Cited by 55 - Related articles - All 8 versions

[HTML] cjig.cn

[HTML] Research progress in visual weakly-supervised learning (视觉弱监督学习研究进展)

D Ren, Q Wang, Y Wei, D Meng, W Zuo - 2022 - cjig.cn

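Abstract: Visual understanding, such as object detection, semantic and instance segmentation, and action recognition,
is widely used and plays a crucial role in fields such as human-computer interaction and autonomous driving.
In recent years, deep visual understanding networks based on fully supervised learning have achieved …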

Cited by 7 - Related articles - All 4 versions

[PDF] thecvf.com

Learning action completeness from points for weakly-supervised temporal action localization

P Lee, H Byun - Proceedings of the IEEE/CVF international …, 2021 - openaccess.thecvf.com

We tackle the problem of localizing temporal intervals of actions with only a single frame
label for each action instance for training. Owing to label sparsity, existing work fails to learn …

Cited by 73 - Related articles - All 8 versions

[PDF] neurips.cc

Open-vocabulary semantic segmentation via attribute decomposition-aggregation

C Ma, Y Yuhuan, C Ju, F Zhang… - Advances in Neural …, 2024 - proceedings.neurips.cc

Open-vocabulary semantic segmentation is a challenging task that requires segmenting
novel object categories at inference time. Recent works explore vision-language pre-training …

Cited by 9 - Related articles - All 4 versions

[PDF] thecvf.com

Two-stream networks for weakly-supervised temporal action localization with semantic-aware mechanisms

Y Wang, Y Li, H Wang - … of the IEEE/CVF Conference on …, 2023 - openaccess.thecvf.com

Weakly-supervised temporal action localization aims to detect action boundaries in
untrimmed videos with only video-level annotations. Most existing schemes detect temporal …

Cited by 12 - Related articles - All 3 versions

[PDF] thecvf.com

Distilling vision-language pre-training to collaborate with weakly-supervised temporal action localization

C Ju, K Zheng, J Liu, P Zhao, Y Zhang… - Proceedings of the …, 2023 - openaccess.thecvf.com

Weakly-supervised temporal action localization (WTAL) learns to detect and classify action
instances with only category labels. Most methods widely adopt the off-the-shelf …

Cited by 17 - Related articles - All 6 versions

[PDF] thecvf.com

Audio-Visual Segmentation via Unlabeled Frame Exploitation

J Liu, Y Liu, F Zhang, C Ju… - Proceedings of the …, 2024 - openaccess.thecvf.com

Audio-visual segmentation (AVS) aims to segment the sounding objects in video frames.
Although great progress has been witnessed, we experimentally reveal that current methods …

Cited by 2 - Related articles - All 4 versions

[PDF] arxiv.org

Multi-modal prompting for low-shot temporal action localization

C Ju, Z Li, P Zhao, Y Zhang, X Zhang, Q Tian… - arXiv preprint arXiv …, 2023 - arxiv.org

In this paper, we consider the problem of temporal action localization under low-shot (zero-
shot & few-shot) scenario, with the goal of detecting and classifying the action instances from …

Cited by 13 - Related articles - All 2 versions

Compact representation and reliable classification learning for point-level weakly-supervised action localization

J Fu, J Gao, C Xu - IEEE Transactions on Image Processing, 2022 - ieeexplore.ieee.org

Point-level weakly-supervised temporal action localization (P-WSTAL) aims to localize
temporal extents of action instances and identify the corresponding categories with only a …

Cited by 11 - Related articles - All 4 versions

[PDF] arxiv.org

Constraint and union for partially-supervised temporal sentence grounding

C Ju, H Wang, J Liu, C Ma, Y Zhang, P Zhao… - arXiv preprint arXiv …, 2023 - arxiv.org

Temporal sentence grounding aims to detect the event timestamps described by the natural
language query from given untrimmed videos. The existing fully-supervised setting achieves …

Cited by 12 - Related articles - All 2 versions