Video you only look once: Overall temporal convolutions for action recognition

Self-supervised visual feature learning with deep neural networks: A survey

L Jing, Y Tian - IEEE transactions on pattern analysis and …, 2020 - ieeexplore.ieee.org

Large-scale labeled data are generally required to train deep neural networks in order to
obtain better performance in visual feature learning from images or videos for computer …

Cited by 2041 · Related articles · All 7 versions

Video-based human activity recognition using deep learning approaches

GAS Surek, LO Seman, SF Stefenon, VC Mariani… - Sensors, 2023 - mdpi.com

Due to its capacity to gather vast, high-level data about human activity from wearable or
stationary sensors, human activity recognition substantially impacts people's day-to-day …

Cited by 26 · Related articles · All 14 versions

Self-supervised spatiotemporal feature learning via video rotation prediction

L Jing, X Yang, J Liu, Y Tian - arXiv preprint arXiv:1811.11387, 2018 - arxiv.org

The success of deep neural networks generally requires a vast amount of training data to be
labeled, which is expensive and unfeasible in scale, especially for video collections. To …

Cited by 140 · Related articles · All 2 versions

Ship target detection and identification based on SSD_MobilenetV2

Y Zou, L Zhao, S Qin, M Pan, Z Li - 2020 IEEE 5th Information …, 2020 - ieeexplore.ieee.org

There are many deep learning algorithms currently used in ship supervision, but they
generally have the problems of insufficient target detection speed and accurate identification …

Cited by 60 · Related articles · All 2 versions

Spatial self-attention network with self-attention distillation for fine-grained image recognition

AA Baffour, Z Qin, Y Wang, Z Qin, KKR Choo - Journal of Visual …, 2021 - Elsevier

The underlining task for fine-grained image recognition captures both the inter-class and
intra-class discriminate features. Existing methods generally use auxiliary data to guide the …

Cited by 20 · Related articles · All 3 versions

FACS3D-Net: 3D convolution based spatiotemporal representation for action unit detection

L Yang, IO Ertugrul, JF Cohn, Z Hammal… - 2019 8th …, 2019 - ieeexplore.ieee.org

Most approaches to automatic facial action unit (AU) detection consider only spatial
information and ignore AU dynamics. For humans, dynamics improves AU perception. Is …

Cited by 39 · Related articles · All 9 versions

3D deformable convolution temporal reasoning network for action recognition

Y Ou, Z Chen - Journal of Visual Communication and Image …, 2023 - Elsevier

Modeling and reasoning of the interactions between multiple entities (actors and objects)
are beneficial for the action recognition task. In this paper, we propose a 3D Deformable …

Cited by 4 · Related articles · All 2 versions

Recognizing American Sign Language manual signs from RGB-D videos

L Jing, E Vahdani, M Huenerfauth, Y Tian - arXiv preprint arXiv …, 2019 - arxiv.org

In this paper, we propose a 3D Convolutional Neural Network (3DCNN) based multi-stream
framework to recognize American Sign Language (ASL) manual signs (consisting of …

Cited by 29 · Related articles · All 3 versions

Analysis of pruned neural networks (MobileNetV2-YOLO v2) for underwater object detection

AF Ayob, K Khairuddin, YM Mustafah, AR Salisa… - Proceedings of the 11th …, 2021 - Springer

Underwater object detection involves the activity of multiple object identification within a
dynamic and noisy environment. Such task is challenging due to the inconsistency of …

Cited by 20 · Related articles · All 4 versions

G‐YOLOX: A Lightweight Network for Detecting Vehicle Types

Q Luo, J Wang, M Gao, H Lin, H Zhou… - Journal of …, 2022 - Wiley Online Library

In recent years, vehicle type detection has had an important role in traffic management. A
lightweight detection network based on multiscale ghost convolution called G‐YOLOX is …

Cited by 7 · Related articles · All 5 versions