A review of multimodal human activity recognition with special emphasis on classification, applications, challenges and future directions
Human activity recognition (HAR) is one of the most important and challenging problems in
the computer vision. It has critical application in wide variety of tasks including gaming …
the computer vision. It has critical application in wide variety of tasks including gaming …
[HTML][HTML] A comprehensive survey of vision-based human action recognition methods
Although widely used in many applications, accurate and efficient human action recognition
remains a challenging area of research in the field of computer vision. Most recent surveys …
remains a challenging area of research in the field of computer vision. Most recent surveys …
A comprehensive study of deep video action recognition
Video action recognition is one of the representative tasks for video understanding. Over the
last decade, we have witnessed great advancements in video action recognition thanks to …
last decade, we have witnessed great advancements in video action recognition thanks to …
Videos as space-time region graphs
How do humans recognize the action" opening a book"? We argue that there are two
important cues: modeling temporal shape dynamics and modeling functional relationships …
important cues: modeling temporal shape dynamics and modeling functional relationships …
Temporal action localization in untrimmed videos via multi-stage cnns
We address temporal action localization in untrimmed long videos. This is important
because videos in real applications are usually unconstrained and contain multiple action …
because videos in real applications are usually unconstrained and contain multiple action …
Long-term temporal convolutions for action recognition
Typical human actions last several seconds and exhibit characteristic spatio-temporal
structure. Recent methods attempt to capture this structure and learn action representations …
structure. Recent methods attempt to capture this structure and learn action representations …
Unsupervised learning of video representations using lstms
N Srivastava, E Mansimov… - … on machine learning, 2015 - proceedings.mlr.press
Abstract We use Long Short Term Memory (LSTM) networks to learn representations of
video sequences. Our model uses an encoder LSTM to map an input sequence into a fixed …
video sequences. Our model uses an encoder LSTM to map an input sequence into a fixed …
Going deeper into action recognition: A survey
Understanding human actions in visual data is tied to advances in complementary research
areas including object recognition, human dynamics, domain adaptation and semantic …
areas including object recognition, human dynamics, domain adaptation and semantic …
Learning spatiotemporal features with 3d convolutional networks
We propose a simple, yet effective approach for spatiotemporal feature learning using deep
3-dimensional convolutional networks (3D ConvNets) trained on a large scale supervised …
3-dimensional convolutional networks (3D ConvNets) trained on a large scale supervised …
Delving deeper into convolutional networks for learning video representations
We propose an approach to learn spatio-temporal features in videos from intermediate
visual representations we call" percepts" using Gated-Recurrent-Unit Recurrent Networks …
visual representations we call" percepts" using Gated-Recurrent-Unit Recurrent Networks …