Gaussian temporal awareness networks for action localization
Temporally localizing actions in a video is a fundamental challenge in video understanding.
Most existing approaches have often drawn inspiration from image object detection and …
Most existing approaches have often drawn inspiration from image object detection and …
Smart frame selection for action recognition
Video classification is computationally expensive. In this paper, we address theproblem of
frame selection to reduce the computational cost of video classification. Recent work has …
frame selection to reduce the computational cost of video classification. Recent work has …
Attentional pooling for action recognition
We introduce a simple yet surprisingly powerful model to incorporate attention in action
recognition and human object interaction tasks. Our proposed attention module can be …
recognition and human object interaction tasks. Our proposed attention module can be …
Representation flow for action recognition
AJ Piergiovanni, MS Ryoo - … of the IEEE/CVF conference on …, 2019 - openaccess.thecvf.com
In this paper, we propose a convolutional layer inspired by optical flow algorithms to learn
motion representations. Our representation flow layer is a fully-differentiable layer designed …
motion representations. Our representation flow layer is a fully-differentiable layer designed …
Spatio-temporal attention networks for action recognition and detection
Recently, 3D Convolutional Neural Network (3D CNN) models have been widely studied for
video sequences and achieved satisfying performance in action recognition and detection …
video sequences and achieved satisfying performance in action recognition and detection …
Lsta: Long short-term attention for egocentric action recognition
S Sudhakaran, S Escalera… - Proceedings of the IEEE …, 2019 - openaccess.thecvf.com
Egocentric activity recognition is one of the most challenging tasks in video analysis. It
requires a fine-grained discrimination of small objects and their manipulation. While some …
requires a fine-grained discrimination of small objects and their manipulation. While some …
T-C3D: Temporal convolutional 3D network for real-time action recognition
Video-based action recognition with deep neural networks has shown remarkable progress.
However, most of the existing approaches are too computationally expensive due to the …
However, most of the existing approaches are too computationally expensive due to the …
The pros and cons: Rank-aware temporal attention for skill determination in long videos
H Doughty, W Mayol-Cuevas… - Proceedings of the …, 2019 - openaccess.thecvf.com
We present a new model to determine relative skill from long videos, through learnable
temporal attention modules. Skill determination is formulated as a ranking problem, making …
temporal attention modules. Skill determination is formulated as a ranking problem, making …
Temporal gaussian mixture layer for videos
AJ Piergiovanni, M Ryoo - International Conference on …, 2019 - proceedings.mlr.press
We introduce a new convolutional layer named the Temporal Gaussian Mixture (TGM) layer
and present how it can be used to efficiently capture longer-term temporal information in …
and present how it can be used to efficiently capture longer-term temporal information in …
Learning latent super-events to detect multiple activities in videos
AJ Piergiovanni, MS Ryoo - Proceedings of the IEEE …, 2018 - openaccess.thecvf.com
In this paper, we introduce the concept of learning latent super-events from activity videos,
and present how it benefits activity detection in continuous videos. We define a super-event …
and present how it benefits activity detection in continuous videos. We define a super-event …