A Comprehensive Review of Data‐Driven Co‐Speech Gesture Generation

S Nyatsanga, T Kucherenko, C Ahuja… - Computer Graphics …, 2023 - Wiley Online Library
Gestures that accompany speech are an essential part of natural and efficient embodied
human communication. The automatic generation of such co‐speech gestures is a long …

Human motion generation: A survey

W Zhu, X Ma, D Ro, H Ci, J Zhang, J Shi… - … on Pattern Analysis …, 2023 - ieeexplore.ieee.org
Human motion generation aims to generate natural human pose sequences and shows
immense potential for real-world applications. Substantial progress has been made recently …

SadTalker: Learning realistic 3D motion coefficients for stylized audio-driven single image talking face animation

W Zhang, X Cun, X Wang, Y Zhang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Generating talking head videos through a face image and a piece of speech audio still
contains many challenges, i.e., unnatural head movement, distorted expression, and identity …

TEMOS: Generating Diverse Human Motions from Textual Descriptions

M Petrovich, MJ Black, G Varol - European Conference on Computer …, 2022 - Springer
We address the problem of generating diverse 3D human motions from textual descriptions.
This challenging task requires joint modeling of both modalities: understanding and …

MoFusion: A framework for denoising-diffusion-based motion synthesis

R Dabral, MH Mughal, V Golyanik… - Proceedings of the …, 2023 - openaccess.thecvf.com
Conventional methods for human motion synthesis have either been deterministic or have
had to struggle with the trade-off between motion diversity vs motion quality. In response to …

Taming diffusion models for audio-driven co-speech gesture generation

L Zhu, X Liu, X Liu, R Qian, Z Liu… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Animating virtual avatars to make co-speech gestures facilitates various applications in
human-machine interaction. The existing methods mainly rely on generative adversarial …

GestureDiffuCLIP: Gesture diffusion model with CLIP latents

T Ao, Z Zhang, L Liu - ACM Transactions on Graphics (TOG), 2023 - dl.acm.org
The automatic generation of stylized co-speech gestures has recently received increasing
attention. Previous systems typically allow style control via predefined text labels or example …

Generating holistic 3D human motion from speech

H Yi, H Liang, Y Liu, Q Cao, Y Wen… - Proceedings of the …, 2023 - openaccess.thecvf.com
This work addresses the problem of generating 3D holistic body motions from human
speech. Given a speech recording, we synthesize sequences of 3D body poses, hand …

Bailando: 3D dance generation by actor-critic GPT with choreographic memory

L Siyao, W Yu, T Gu, C Lin, Q Wang… - Proceedings of the …, 2022 - openaccess.thecvf.com
Driving 3D characters to dance following a piece of music is highly challenging due to the
spatial constraints applied to poses by choreography norms. In addition, the generated …

AI Choreographer: Music conditioned 3D dance generation with AIST++

R Li, S Yang, DA Ross… - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com
We present AIST++, a new multi-modal dataset of 3D dance motion and music, along with
FACT, a Full-Attention Cross-modal Transformer network for generating 3D dance motion …