Temporal 3d convnets: New architecture and transfer learning for video classification
… our transfer learning: 2D → 3D ConvNet performance with generic state-of-the-art 3D CNN
… weight initialization via transfer learning is possible for 3D ConvNet architecture, which can …
… weight initialization via transfer learning is possible for 3D ConvNet architecture, which can …
Learning spatiotemporal features with 3d convolutional networks
… fully-connected layers which perform well on transfer learning tasks [47… propose to learn
spatio-temporal features using deep 3D ConvNet. We … for video classification. We also verify that …
spatio-temporal features using deep 3D ConvNet. We … for video classification. We also verify that …
Rethinking spatiotemporal feature learning: Speed-accuracy trade-offs in video classification
… replacing 3D convolutions with spatial and temporal separable 3D convolutions, … video
classification datasets Next we conduct transfer learning experiments from Kinetics to other video …
classification datasets Next we conduct transfer learning experiments from Kinetics to other video …
Convnet architecture search for spatiotemporal feature learning
… architecture search for video classification on UCF101. We showed that our observations are
useful for spatiotemporal feature learning … be matched to a temporal slice of a 3D filter. We …
useful for spatiotemporal feature learning … be matched to a temporal slice of a 3D filter. We …
[PDF][PDF] Rethinking spatiotemporal feature learning for video understanding
… I3D-K, we retain 3D temporal convolutions at the lowest K layers … significantly on video
classification and action detection tasks. … transfer learning experiments from Kinetics to other video …
classification and action detection tasks. … transfer learning experiments from Kinetics to other video …
Two-stream 3-d convnet fusion for action recognition in videos with arbitrary size and length
… , we decompose a video into spatial and temporal shots. By … 3D convNet, which can recognize
human actions in videos of … size and diversity for video classification. In terms of HMDB51 …
human actions in videos of … size and diversity for video classification. In terms of HMDB51 …
Spatio-temporal deformable 3d convnets with attention for action recognition
… motion information, we propose the temporal deformable 3D ConvNets based on the attention.
The temporal deformable convolution can learn both the temporal and the spatial offsets …
The temporal deformable convolution can learn both the temporal and the spatial offsets …
3D convolutional networks with multi-layer-pooling selection fusion for video classification
Z Hu, R Zhang, Y Qiu, M Zhao, Z Sun - Multimedia Tools and Applications, 2021 - Springer
… , for the video representation building based on 3D ConvNets, … spatial and temporal jittering
and different video sample rate. … to learn robust video representation for video classification. …
and different video sample rate. … to learn robust video representation for video classification. …
Would mega-scale datasets further enhance spatiotemporal 3D CNNs?
… Following a comprehensive study of transfer learning on ImageNet [… the-art video classification
performance in the present paper. … the spatial and temporal volume at each stacked block. …
performance in the present paper. … the spatial and temporal volume at each stacked block. …
3D-TDC: A 3D temporal dilation convolution framework for video action recognition
… into the convolution network at one time, resulting in a limited temporal … 3D temporal dilation
convolution (3D-TDC) framework, to extract spatio-temporal features of actions from videos. …
convolution (3D-TDC) framework, to extract spatio-temporal features of actions from videos. …