State of the art on monocular 3D face reconstruction, tracking, and applications
The computer graphics and vision communities have dedicated long standing efforts in
building computerized tools for reconstructing, tracking, and analyzing human faces based …
Deep learning for visual speech analysis: A survey
C Sheng, G Kuang, L Bai, C Hou, Y Guo… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org
Visual speech, referring to the visual domain of speech, has attracted increasing attention
due to its wide applications, such as public security, medical treatment, military defense, and …
Codetalker: Speech-driven 3d facial animation with discrete motion prior
Speech-driven 3D facial animation has been widely studied, yet there is still a gap to
achieving realism and vividness due to the highly ill-posed nature and scarcity of audio …
Ad-nerf: Audio driven neural radiance fields for talking head synthesis
Generating high-fidelity talking head video by fitting with the input audio sequence is a
challenging problem that receives considerable attentions recently. In this paper, we …
Flow-guided one-shot talking face generation with a high-resolution audio-visual dataset
One-shot talking face generation should synthesize high visual quality facial videos with
reasonable animations of expression and head pose, and just utilize arbitrary driving audio …
Styleheat: One-shot high-resolution editable talking face generation via pre-trained stylegan
One-shot talking face generation aims at synthesizing a high-quality talking face video from
an arbitrary portrait image, driven by a video or an audio segment. In this work, we provide a …
Diffused heads: Diffusion models beat gans on talking-face generation
M Stypułkowski, K Vougioukas, S He… - Proceedings of the …, 2024 - openaccess.thecvf.com
Talking face generation has historically struggled to produce head movements and natural
facial expressions without guidance from additional reference videos. Recent developments …
Live speech portraits: real-time photorealistic talking-head animation
To the best of our knowledge, we first present a live system that generates personalized
photorealistic talking-head animation only driven by audio signals at over 30 fps. Our system …
MakeItTalk: speaker-aware talking-head animation
We present a method that generates expressive talking-head videos from a single facial
image with audio as the only input. In contrast to previous attempts to learn direct mappings …
Meshtalk: 3d face animation from speech using cross-modality disentanglement
This paper presents a generic method for generating full facial 3D animation from speech.
Existing approaches to audio-driven facial animation exhibit uncanny or static upper face …