Deep learning-based human pose estimation: A survey
Human pose estimation aims to locate the human body parts and build human body
representation (e.g., body skeleton) from input data such as images and videos. It has drawn …
Deep 3D human pose estimation: A review
Three-dimensional (3D) human pose estimation involves estimating the articulated
3D joint locations of a human body from an image or video. Due to its widespread …
AlphaPose: Whole-body regional multi-person pose estimation and tracking in real-time
Accurate whole-body multi-person pose estimation and tracking is an important yet
challenging topic in computer vision. To capture the subtle actions of humans for complex …
Sequential modeling enables scalable learning for large vision models
We introduce a novel sequential modeling approach which enables learning a Large Vision
Model (LVM) without making use of any linguistic data. To do this we define a common …
Humans in 4D: Reconstructing and tracking humans with transformers
We present an approach to reconstruct humans and track them over time. At the core of our
approach, we propose a fully "transformerized" version of a network for human mesh …
ViTPose: Simple vision transformer baselines for human pose estimation
Although no specific domain knowledge is considered in the design, plain vision
transformers have shown excellent performance in visual recognition tasks. However, little …
InstructDiffusion: A generalist modeling interface for vision tasks
We present InstructDiffusion, a unified and generic framework for aligning computer vision
tasks with human instructions. Unlike existing approaches that integrate prior knowledge …
MVImgNet: A large-scale dataset of multi-view images
Being data-driven is one of the most iconic properties of deep learning algorithms. The birth
of ImageNet drives a remarkable trend of "learning from large-scale data" in computer vision …
BEDLAM: A synthetic dataset of bodies exhibiting detailed lifelike animated motion
We show, for the first time, that neural networks trained only on synthetic data achieve state-
of-the-art accuracy on the problem of 3D human pose and shape (HPS) estimation from real …
CLIFF: Carrying location information in full frames into human pose and shape estimation
Top-down methods dominate the field of 3D human pose and shape estimation, because
they are decoupled from human detection and allow researchers to focus on the core …