Recovering 3d human mesh from monocular images: A survey

Y Tian, H Zhang, Y Liu, L Wang - IEEE transactions on pattern …, 2023 - ieeexplore.ieee.org
Estimating human pose and shape from monocular images is a long-standing problem in
computer vision. Since the release of statistical body models, 3D human mesh recovery has …

State of the art on diffusion models for visual computing

R Po, W Yifan, V Golyanik, K Aberman… - Computer Graphics …, 2024 - Wiley Online Library
The field of visual computing is rapidly advancing due to the emergence of generative
artificial intelligence (AI), which unlocks unprecedented capabilities for the generation …

Effective whole-body pose estimation with two-stages distillation

Z Yang, A Zeng, C Yuan, Y Li - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Whole-body pose estimation localizes the human body, hand, face, and foot keypoints in an
image. This task is challenging due to multi-scale body parts, fine-grained localization for …

Grounded sam: Assembling open-world models for diverse visual tasks

T Ren, S Liu, A Zeng, J Lin, K Li, H Cao, J Chen… - arXiv preprint arXiv …, 2024 - arxiv.org
We introduce Grounded SAM, which uses Grounding DINO as an open-set object detector to
combine with the segment anything model (SAM). This integration enables the detection and …

Humanmac: Masked motion completion for human motion prediction

LH Chen, J Zhang, Y Li, Y Pang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Human motion prediction is a classical problem in computer vision and computer graphics,
which has a wide range of practical applications. Previous effects achieve great empirical …

Miradata: A large-scale video dataset with long durations and structured captions

X Ju, Y Gao, Z Zhang, Z Yuan, X Wang, A Zeng… - arXiv preprint arXiv …, 2024 - arxiv.org
Sora's high-motion intensity and long consistent videos have significantly impacted the field
of video generation, attracting unprecedented attention. However, existing publicly available …

Large motion model for unified multi-modal motion generation

M Zhang, D Jin, C Gu, F Hong, Z Cai, J Huang… - … on Computer Vision, 2025 - Springer
Human motion generation, a cornerstone technique in animation and video production, has
widespread applications in various tasks like text-to-motion and music-to-dance. Previous …

360-degree Human Video Generation with 4D Diffusion Transformer

R Shao, Y Pang, Z Zheng, J Sun, Y Liu - ACM Transactions on Graphics …, 2024 - dl.acm.org
We present a novel approach for generating 360-degree high-quality, spatiotemporally
coherent human videos from a single image. Our framework combines the strengths of …

Imugpt 2.0: Language-based cross modality transfer for sensor-based human activity recognition

Z Leng, A Bhattacharjee, H Rajasekhar… - Proceedings of the …, 2024 - dl.acm.org
One of the primary challenges in the field of human activity recognition (HAR) is the lack of
large labeled datasets. This hinders the development of robust and generalizable models …

AnySkill: Learning Open-Vocabulary Physical Skill for Interactive Agents

J Cui, T Liu, N Liu, Y Yang, Y Zhu… - Proceedings of the …, 2024 - openaccess.thecvf.com
Traditional approaches in physics-based motion generation centered around imitation
learning and reward shaping often struggle to adapt to new scenarios. To tackle this …