Speech gesture generation from the trimodal context of text, audio, and speaker identity

R Po, W Yifan, V Golyanik, K Aberman… - Computer Graphics …, 2024 - Wiley Online Library

The field of visual computing is rapidly advancing due to the emergence of generative
artificial intelligence (AI), which unlocks unprecedented capabilities for the generation …

Cited by 88 · Related articles · All 12 versions

A Comprehensive Review of Data‐Driven Co‐Speech Gesture Generation

S Nyatsanga, T Kucherenko, C Ahuja… - Computer Graphics …, 2023 - Wiley Online Library

Gestures that accompany speech are an essential part of natural and efficient embodied
human communication. The automatic generation of such co‐speech gestures is a long …

Cited by 79 · Related articles · All 14 versions

Taming diffusion models for audio-driven co-speech gesture generation

L Zhu, X Liu, X Liu, R Qian, Z Liu… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

Animating virtual avatars to make co-speech gestures facilitates various applications in
human-machine interaction. The existing methods mainly rely on generative adversarial …

Cited by 99 · Related articles · All 7 versions

GestureDiffuCLIP: Gesture diffusion model with CLIP latents

T Ao, Z Zhang, L Liu - ACM Transactions on Graphics (TOG), 2023 - dl.acm.org

The automatic generation of stylized co-speech gestures has recently received increasing
attention. Previous systems typically allow style control via predefined text labels or example …

Cited by 129 · Related articles · All 3 versions

Generating holistic 3D human motion from speech

H Yi, H Liang, Y Liu, Q Cao, Y Wen… - Proceedings of the …, 2023 - openaccess.thecvf.com

This work addresses the problem of generating 3D holistic body motions from human
speech. Given a speech recording, we synthesize sequences of 3D body poses, hand …

Cited by 127 · Related articles · All 7 versions

Listen, denoise, action! Audio-driven motion synthesis with diffusion models

S Alexanderson, R Nagy, J Beskow… - ACM Transactions on …, 2023 - dl.acm.org

Diffusion models have experienced a surge of interest as highly expressive yet efficiently
trainable probabilistic models. We show that these models are an excellent fit for …

Cited by 151 · Related articles · All 4 versions

Rhythmic gesticulator: Rhythm-aware co-speech gesture synthesis with hierarchical neural embeddings

T Ao, Q Gao, Y Lou, B Chen, L Liu - ACM Transactions on Graphics …, 2022 - dl.acm.org

Automatic synthesis of realistic co-speech gestures is an increasingly important yet
challenging task in artificial embodied agent creation. Previous systems mainly focus on …

Cited by 108 · Related articles · All 3 versions

Learning hierarchical cross-modal association for co-speech gesture generation

X Liu, Q Wu, H Zhou, Y Xu, R Qian… - Proceedings of the …, 2022 - openaccess.thecvf.com

Generating speech-consistent body and gesture movements is a long-standing problem in
virtual avatar creation. Previous studies often synthesize pose movement in a holistic …

Cited by 117 · Related articles · All 5 versions

Strategies for Parent Involvement During Distance Learning in Arabic Lessons in Elementary Schools.

A Kartel, M Charles, H Xiao… - … : Journal International of …, 2022 - search.ebscohost.com

This study aims to find various obstacles, describe, and provide strategies for parents when
accompanying and providing direction to their children in distance learning. The method …

Cited by 115 · Related articles · All 3 versions

BEAT: A large-scale semantic and emotional multi-modal dataset for conversational gestures synthesis

H Liu, Z Zhu, N Iwamoto, Y Peng, Z Li, Y Zhou… - European conference on …, 2022 - Springer

Achieving realistic, vivid, and human-like synthesized conversational gestures conditioned
on multi-modal data is still an unsolved problem due to the lack of available datasets …

Cited by 132 · Related articles · All 7 versions