2d human pose estimation: New benchmark and state of the art analysis

C Zheng, W Wu, C Chen, T Yang, S Zhu, J Shen… - ACM Computing …, 2023 - dl.acm.org

Human pose estimation aims to locate the human body parts and build human body
representation (e.g., body skeleton) from input data such as images and videos. It has drawn …

Cited by 535 · Related articles · All 4 versions

[HTML] sciencedirect.com

[HTML] Deep 3D human pose estimation: A review

J Wang, S Tan, X Zhen, S Xu, F Zheng, Z He… - Computer Vision and …, 2021 - Elsevier

Abstract Three-dimensional (3D) human pose estimation involves estimating the articulated
3D joint locations of a human body from an image or video. Due to its widespread …

Cited by 337 · Related articles · All 5 versions

[PDF] arxiv.org

Alphapose: Whole-body regional multi-person pose estimation and tracking in real-time

HS Fang, J Li, H Tang, C Xu, H Zhu… - … on Pattern Analysis …, 2022 - ieeexplore.ieee.org

Accurate whole-body multi-person pose estimation and tracking is an important yet
challenging topic in computer vision. To capture the subtle actions of humans for complex …

Cited by 492 · Related articles · All 8 versions

[PDF] thecvf.com

Sequential modeling enables scalable learning for large vision models

Y Bai, X Geng, K Mangalam, A Bar… - Proceedings of the …, 2024 - openaccess.thecvf.com

We introduce a novel sequential modeling approach which enables learning a Large Vision
Model (LVM) without making use of any linguistic data. To do this we define a common …

Cited by 125 · Related articles · All 3 versions

[PDF] thecvf.com

Humans in 4D: Reconstructing and tracking humans with transformers

S Goel, G Pavlakos, J Rajasegaran… - Proceedings of the …, 2023 - openaccess.thecvf.com

We present an approach to reconstruct humans and track them over time. At the core of our
approach, we propose a fully "transformerized" version of a network for human mesh …

Cited by 163 · Related articles · All 5 versions

[PDF] neurips.cc

Vitpose: Simple vision transformer baselines for human pose estimation

Y Xu, J Zhang, Q Zhang, D Tao - Advances in Neural …, 2022 - proceedings.neurips.cc

Although no specific domain knowledge is considered in the design, plain vision
transformers have shown excellent performance in visual recognition tasks. However, little …

Cited by 618 · Related articles · All 5 versions

[PDF] thecvf.com

Instructdiffusion: A generalist modeling interface for vision tasks

Z Geng, B Yang, T Hang, C Li, S Gu… - Proceedings of the …, 2024 - openaccess.thecvf.com

We present InstructDiffusion, a unified and generic framework for aligning computer vision
tasks with human instructions. Unlike existing approaches that integrate prior knowledge …

Cited by 78 · Related articles · All 3 versions

[PDF] thecvf.com

Mvimgnet: A large-scale dataset of multi-view images

X Yu, M Xu, Y Zhang, H Liu, C Ye… - Proceedings of the …, 2023 - openaccess.thecvf.com

Being data-driven is one of the most iconic properties of deep learning algorithms. The birth
of ImageNet drives a remarkable trend of "learning from large-scale data" in computer vision …

Cited by 132 · Related articles · All 5 versions

[PDF] thecvf.com

Bedlam: A synthetic dataset of bodies exhibiting detailed lifelike animated motion

MJ Black, P Patel, J Tesch… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

We show, for the first time, that neural networks trained only on synthetic data achieve state-
of-the-art accuracy on the problem of 3D human pose and shape (HPS) estimation from real …

Cited by 119 · Related articles · All 5 versions

[PDF] arxiv.org

Cliff: Carrying location information in full frames into human pose and shape estimation

Z Li, J Liu, Z Zhang, S Xu, Y Yan - European Conference on Computer …, 2022 - Springer

Top-down methods dominate the field of 3D human pose and shape estimation, because
they are decoupled from human detection and allow researchers to focus on the core …

Cited by 227 · Related articles · All 5 versions