Action recognition based on RGB and skeleton data sets: A survey

R Yue, Z Tian, S Du - Neurocomputing, 2022 - Elsevier
Action recognition is a major branch of computer vision research. As a widely used
technology, action recognition has been applied to human–computer interaction, intelligent …

Sequential modeling enables scalable learning for large vision models

Y Bai, X Geng, K Mangalam, A Bar… - Proceedings of the …, 2024 - openaccess.thecvf.com
We introduce a novel sequential modeling approach which enables learning a Large Vision
Model (LVM) without making use of any linguistic data. To do this we define a common …

InternVideo: General video foundation models via generative and discriminative learning

Y Wang, K Li, Y Li, Y He, B Huang, Z Zhao… - arXiv preprint arXiv …, 2022 - arxiv.org
The foundation models have recently shown excellent performance on a variety of
downstream tasks in computer vision. However, most existing vision foundation models …

Prompting visual-language models for efficient video understanding

C Ju, T Han, K Zheng, Y Zhang, W Xie - European Conference on …, 2022 - Springer
Image-based visual-language (I-VL) pre-training has shown great success for learning joint
visual-textual representations from large-scale web data, revealing remarkable ability for …

Evidential deep learning for open set action recognition

W Bao, Q Yu, Y Kong - Proceedings of the IEEE/CVF …, 2021 - openaccess.thecvf.com
In a real-world scenario, human actions are typically out of the distribution from training data,
which requires a model to both recognize the known actions and reject the unknown …

Human-to-robot imitation in the wild

S Bahl, A Gupta, D Pathak - arXiv preprint arXiv:2207.09450, 2022 - arxiv.org
We approach the problem of learning by watching humans in the wild. While traditional
approaches in Imitation and Reinforcement Learning are promising for learning in the real …

A comprehensive study of deep video action recognition

Y Zhu, X Li, C Liu, M Zolfaghari, Y Xiong, C Wu… - arXiv preprint arXiv …, 2020 - arxiv.org
Video action recognition is one of the representative tasks for video understanding. Over the
last decade, we have witnessed great advancements in video action recognition thanks to …

SoccerNet-v2: A dataset and benchmarks for holistic understanding of broadcast soccer videos

A Deliege, A Cioppa, S Giancola… - Proceedings of the …, 2021 - openaccess.thecvf.com
Understanding broadcast videos is a challenging task in computer vision, as it requires
generic reasoning capabilities to appreciate the content offered by the video editing. In this …

AR-Net: Adaptive frame resolution for efficient action recognition

Y Meng, CC Lin, R Panda, P Sattigeri… - Computer Vision–ECCV …, 2020 - Springer
Action recognition is an open and challenging problem in computer vision. While current
state-of-the-art models offer excellent recognition results, their computational expense limits …

DeepEthogram, a machine learning pipeline for supervised behavior classification from raw pixels

JP Bohnslav, NK Wimalasena, KJ Clausing, YY Dai… - eLife, 2021 - elifesciences.org
Videos of animal behavior are used to quantify researcher-defined behaviors of interest to
study neural function, gene mutations, and pharmacological therapies. Behaviors of interest …