Detecting events and key actors in multi-person videos

Authors

Vignesh Ramanathan, Jonathan Huang, Sami Abu-El-Haija, Alexander Gorban, Kevin Murphy, Li Fei-Fei

Publication date

2016

Conference

Proceedings of the IEEE conference on computer vision and pattern recognition

Pages

3043-3053

Description

Multi-person event recognition is a challenging task, often with many people active in the scene but only a small subset contributing to an actual event. In this paper, we propose a model which learns to detect events in such videos while automatically "attending" to the people responsible for the event. Our model does not use explicit annotations regarding who or where those people are during training and testing. In particular, we track people in videos and use a recurrent neural network (RNN) to represent the track features. We learn time-varying attention weights to combine these features at each time-instant. The attended features are then processed using another RNN for event detection/classification. Since most video datasets with multiple people are restricted to a small number of videos, we also collected a new basketball dataset comprising 257 basketball games with 14K event annotations corresponding to 11 event classes. Our model outperforms state-of-the-art methods for both event classification and detection on this new dataset. Additionally, we show that the attention mechanism is able to consistently localize the relevant players.
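
The attention step described in the abstract (combining per-person track features with time-varying weights) can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the function name `attend_over_players`, the linear scoring vector `w`, the tensor shapes, and the random inputs are all assumptions chosen for illustration, and the RNNs that produce the track features and consume the attended features are omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attend_over_players(track_feats, w):
    """Combine per-player features into one feature vector per time step.

    track_feats: array of shape (T, P, D), features for P tracked people
                 over T time steps (hypothetical stand-in for RNN outputs).
    w:           array of shape (D,), a hypothetical linear scoring vector.
    Returns (attended, weights) with shapes (T, D) and (T, P).
    """
    scores = track_feats @ w               # (T, P): one score per person per step
    weights = softmax(scores, axis=1)      # normalize over people at each step
    attended = np.einsum("tp,tpd->td", weights, track_feats)  # weighted sum
    return attended, weights

# Toy inputs: 5 time steps, 4 tracked people, 8-dimensional features.
rng = np.random.default_rng(0)
T, P, D = 5, 4, 8
track_feats = rng.normal(size=(T, P, D))
w = rng.normal(size=D)

attended, weights = attend_over_players(track_feats, w)
```

In this sketch, `weights[t]` sums to 1 over the people at time step `t`, so the largest entry indicates which track the model would treat as the key actor at that moment; `attended` is what a downstream event classifier would receive.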

Total citations

Cited by 255

Citations per year: 2016: 4, 2017: 32, 2018: 35, 2019: 36, 2020: 39, 2021: 41, 2022: 31, 2023: 23, 2024: 9
