Authors
Limin Wang, Yuanjun Xiong, Zhe Wang, Yu Qiao, Dahua Lin, Xiaoou Tang, Luc Van Gool
Publication date
2018/9/2
Journal
IEEE Transactions on Pattern Analysis and Machine Intelligence
Volume
41
Issue
11
Pages
2740-2755
Publisher
IEEE
Description
We present a general and flexible video-level framework for learning action models in videos. This method, called temporal segment network (TSN), aims to model long-range temporal structure with a new segment-based sampling and aggregation scheme. This unique design enables the TSN framework to efficiently learn action models by using the whole video. The learned models could be easily deployed for action recognition in both trimmed and untrimmed videos with simple average pooling and multi-scale temporal window integration, respectively. We also study a series of good practices for the implementation of the TSN framework given limited training samples. Our approach obtains the state-of-the-art performance on five challenging action recognition benchmarks: HMDB51 (71.0 percent), UCF101 (94.9 percent), THUMOS14 (80.1 percent), ActivityNet v1.2 (89.6 percent), and Kinetics400 (75.7 percent …
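The segment-based sampling and aggregation idea mentioned in the description can be illustrated with a minimal sketch. The abstract does not give the details, so everything here is an assumption for illustration: the video is split into equal-length segments, one frame index is drawn uniformly at random from each segment, and per-snippet class scores are combined by simple averaging. The function names `sample_segment_indices` and `average_consensus` are hypothetical and are not taken from the authors' code.

```python
import random


def sample_segment_indices(num_frames, num_segments, rng=None):
    """Draw one frame index from each of num_segments equal-length segments.

    Assumption: segments are contiguous, roughly equal in length, and the
    index inside each segment is chosen uniformly at random.
    """
    if num_segments <= 0 or num_frames < num_segments:
        raise ValueError("need num_frames >= num_segments >= 1")
    rng = rng or random.Random()
    indices = []
    for k in range(num_segments):
        start = (k * num_frames) // num_segments       # inclusive
        end = ((k + 1) * num_frames) // num_segments   # exclusive
        indices.append(rng.randrange(start, end))
    return indices


def average_consensus(snippet_scores):
    """Average per-snippet class scores into one video-level score vector."""
    n = len(snippet_scores)
    num_classes = len(snippet_scores[0])
    return [sum(s[c] for s in snippet_scores) / n for c in range(num_classes)]
```

For example, `sample_segment_indices(100, 4)` returns four indices, one from each quarter of a 100-frame video, and `average_consensus` reduces the scores predicted for those snippets to a single video-level prediction.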
Total citations
[Bar chart: citations per year, 2017-2024]
Scholar articles
L Wang, Y Xiong, Z Wang, Y Qiao, D Lin, X Tang… - IEEE transactions on pattern analysis and machine …, 2018