Deep learning-based human pose estimation: A survey

C Zheng, W Wu, C Chen, T Yang, S Zhu, J Shen… - ACM Computing …, 2023 - dl.acm.org
Human pose estimation aims to locate the human body parts and build human body
representation (e.g., body skeleton) from input data such as images and videos. It has drawn …

Deep 3D human pose estimation: A review

J Wang, S Tan, X Zhen, S Xu, F Zheng, Z He… - Computer Vision and …, 2021 - Elsevier
Three-dimensional (3D) human pose estimation involves estimating the articulated
3D joint locations of a human body from an image or video. Due to its widespread …

AlphaPose: Whole-body regional multi-person pose estimation and tracking in real-time

HS Fang, J Li, H Tang, C Xu, H Zhu… - … on Pattern Analysis …, 2022 - ieeexplore.ieee.org
Accurate whole-body multi-person pose estimation and tracking is an important yet
challenging topic in computer vision. To capture the subtle actions of humans for complex …

Sequential modeling enables scalable learning for large vision models

Y Bai, X Geng, K Mangalam, A Bar… - Proceedings of the …, 2024 - openaccess.thecvf.com
We introduce a novel sequential modeling approach which enables learning a Large Vision
Model (LVM) without making use of any linguistic data. To do this we define a common …

Humans in 4D: Reconstructing and tracking humans with transformers

S Goel, G Pavlakos, J Rajasegaran… - Proceedings of the …, 2023 - openaccess.thecvf.com
We present an approach to reconstruct humans and track them over time. At the core of our
approach, we propose a fully "transformerized" version of a network for human mesh …

ViTPose: Simple vision transformer baselines for human pose estimation

Y Xu, J Zhang, Q Zhang, D Tao - Advances in Neural …, 2022 - proceedings.neurips.cc
Although no specific domain knowledge is considered in the design, plain vision
transformers have shown excellent performance in visual recognition tasks. However, little …

InstructDiffusion: A generalist modeling interface for vision tasks

Z Geng, B Yang, T Hang, C Li, S Gu… - Proceedings of the …, 2024 - openaccess.thecvf.com
We present InstructDiffusion, a unified and generic framework for aligning computer vision
tasks with human instructions. Unlike existing approaches that integrate prior knowledge …

MVImgNet: A large-scale dataset of multi-view images

X Yu, M Xu, Y Zhang, H Liu, C Ye… - Proceedings of the …, 2023 - openaccess.thecvf.com
Being data-driven is one of the most iconic properties of deep learning algorithms. The birth
of ImageNet drives a remarkable trend of "learning from large-scale data" in computer vision …

BEDLAM: A synthetic dataset of bodies exhibiting detailed lifelike animated motion

MJ Black, P Patel, J Tesch… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
We show, for the first time, that neural networks trained only on synthetic data achieve state-
of-the-art accuracy on the problem of 3D human pose and shape (HPS) estimation from real …

CLIFF: Carrying location information in full frames into human pose and shape estimation

Z Li, J Liu, Z Zhang, S Xu, Y Yan - European Conference on Computer …, 2022 - Springer
Top-down methods dominate the field of 3D human pose and shape estimation, because
they are decoupled from human detection and allow researchers to focus on the core …