A deep learning approach for generalized speech animation

M Zollhöfer, J Thies, P Garrido, D Bradley… - Computer graphics …, 2018 - Wiley Online Library

The computer graphics and vision communities have dedicated long standing efforts in
building computerized tools for reconstructing, tracking, and analyzing human faces based …

Cited by 351 · Related articles · All 9 versions

Deep learning for visual speech analysis: A survey

C Sheng, G Kuang, L Bai, C Hou, Y Guo… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org

Visual speech, referring to the visual domain of speech, has attracted increasing attention
due to its wide applications, such as public security, medical treatment, military defense, and …

Cited by 29 · Related articles · All 9 versions

CodeTalker: Speech-driven 3D facial animation with discrete motion prior

J Xing, M Xia, Y Zhang, X Cun… - Proceedings of the …, 2023 - openaccess.thecvf.com

Speech-driven 3D facial animation has been widely studied, yet there is still a gap to
achieving realism and vividness due to the highly ill-posed nature and scarcity of audio …

Cited by 92 · Related articles · All 8 versions

AD-NeRF: Audio driven neural radiance fields for talking head synthesis

Y Guo, K Chen, S Liang, YJ Liu… - Proceedings of the …, 2021 - openaccess.thecvf.com

Generating high-fidelity talking head video by fitting with the input audio sequence is a
challenging problem that receives considerable attentions recently. In this paper, we …

Cited by 327 · Related articles · All 7 versions

Flow-guided one-shot talking face generation with a high-resolution audio-visual dataset

Z Zhang, L Li, Y Ding, C Fan - Proceedings of the IEEE/CVF …, 2021 - openaccess.thecvf.com

One-shot talking face generation should synthesize high visual quality facial videos with
reasonable animations of expression and head pose, and just utilize arbitrary driving audio …

Cited by 243 · Related articles · All 5 versions

StyleHEAT: One-shot high-resolution editable talking face generation via pre-trained StyleGAN

F Yin, Y Zhang, X Cun, M Cao, Y Fan, X Wang… - European conference on …, 2022 - Springer

One-shot talking face generation aims at synthesizing a high-quality talking face video from
an arbitrary portrait image, driven by a video or an audio segment. In this work, we provide a …

Cited by 135 · Related articles · All 6 versions

Diffused heads: Diffusion models beat GANs on talking-face generation

M Stypułkowski, K Vougioukas, S He… - Proceedings of the …, 2024 - openaccess.thecvf.com

Talking face generation has historically struggled to produce head movements and natural
facial expressions without guidance from additional reference videos. Recent developments …

Cited by 79 · Related articles · All 6 versions

Live speech portraits: real-time photorealistic talking-head animation

Y Lu, J Chai, X Cao - ACM Transactions on Graphics (ToG), 2021 - dl.acm.org

To the best of our knowledge, we first present a live system that generates personalized
photorealistic talking-head animation only driven by audio signals at over 30 fps. Our system …

Cited by 152 · Related articles · All 4 versions

MakeItTalk: speaker-aware talking-head animation

Y Zhou, X Han, E Shechtman, J Echevarria… - ACM Transactions On …, 2020 - dl.acm.org

We present a method that generates expressive talking-head videos from a single facial
image with audio as the only input. In contrast to previous attempts to learn direct mappings …

Cited by 388 · Related articles · All 4 versions

MeshTalk: 3D face animation from speech using cross-modality disentanglement

A Richard, M Zollhöfer, Y Wen… - Proceedings of the …, 2021 - openaccess.thecvf.com

This paper presents a generic method for generating full facial 3D animation from speech.
Existing approaches to audio-driven facial animation exhibit uncanny or static upper face …

Cited by 176 · Related articles · All 6 versions