A Comprehensive Review of Data‐Driven Co‐Speech Gesture Generation
Gestures that accompany speech are an essential part of natural and efficient embodied
human communication. The automatic generation of such co‐speech gestures is a long …
Human motion generation: A survey
Human motion generation aims to generate natural human pose sequences and shows
immense potential for real-world applications. Substantial progress has been made recently …
Sadtalker: Learning realistic 3d motion coefficients for stylized audio-driven single image talking face animation
Generating talking head videos through a face image and a piece of speech audio still
contains many challenges, i.e., unnatural head movement, distorted expression, and identity …
TEMOS: Generating Diverse Human Motions from Textual Descriptions
We address the problem of generating diverse 3D human motions from textual descriptions.
This challenging task requires joint modeling of both modalities: understanding and …
Mofusion: A framework for denoising-diffusion-based motion synthesis
Conventional methods for human motion synthesis have either been deterministic or have
had to struggle with the trade-off between motion diversity and motion quality. In response to …
Taming diffusion models for audio-driven co-speech gesture generation
Animating virtual avatars to make co-speech gestures facilitates various applications in
human-machine interaction. The existing methods mainly rely on generative adversarial …
Gesturediffuclip: Gesture diffusion model with clip latents
T Ao, Z Zhang, L Liu - ACM Transactions on Graphics (TOG), 2023 - dl.acm.org
The automatic generation of stylized co-speech gestures has recently received increasing
attention. Previous systems typically allow style control via predefined text labels or example …
Generating holistic 3d human motion from speech
This work addresses the problem of generating 3D holistic body motions from human
speech. Given a speech recording, we synthesize sequences of 3D body poses, hand …
Bailando: 3d dance generation by actor-critic gpt with choreographic memory
Driving 3D characters to dance following a piece of music is highly challenging due to the
spatial constraints applied to poses by choreography norms. In addition, the generated …
Ai choreographer: Music conditioned 3d dance generation with aist++
We present AIST++, a new multi-modal dataset of 3D dance motion and music, along with
FACT, a Full-Attention Cross-modal Transformer network for generating 3D dance motion …