Deep learning for visual speech analysis: A survey

C Sheng, G Kuang, L Bai, C Hou, Y Guo… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org
Visual speech, referring to the visual domain of speech, has attracted increasing attention
due to its wide applications, such as public security, medical treatment, military defense, and …

SadTalker: Learning realistic 3D motion coefficients for stylized audio-driven single image talking face animation

W Zhang, X Cun, X Wang, Y Zhang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Generating talking head videos through a face image and a piece of speech audio still
contains many challenges, i.e., unnatural head movement, distorted expression, and identity …

MetaPortrait: Identity-preserving talking head generation with fast personalized adaptation

B Zhang, C Qi, P Zhang, B Zhang… - Proceedings of the …, 2023 - openaccess.thecvf.com
In this work, we propose an ID-preserving talking head generation framework, which
advances previous methods in two aspects. First, as opposed to interpolating from sparse …

DPE: Disentanglement of pose and expression for general video portrait editing

Y Pang, Y Zhang, W Quan, Y Fan… - Proceedings of the …, 2023 - openaccess.thecvf.com
One-shot video-driven talking face generation aims at producing a synthetic talking video by
transferring the facial motion from a video to an arbitrary portrait image. Head pose and …

DiffPoseTalk: Speech-driven stylistic 3D facial animation and head pose generation via diffusion models

Z Sun, T Lv, S Ye, M Lin, J Sheng, YH Wen… - ACM Transactions on …, 2024 - dl.acm.org
The generation of stylistic 3D facial animations driven by speech presents a significant
challenge as it requires learning a many-to-many mapping between speech, style, and the …

Application of a 3D Talking Head as Part of Telecommunication AR, VR, MR System: Systematic Review

N Christoff, NN Neshov, K Tonchev, A Manolova - Electronics, 2023 - mdpi.com
In today's digital era, the realms of virtual reality (VR), augmented reality (AR), and mixed
reality (MR), collectively referred to as extended reality (XR), are reshaping human–computer …

DiffSHEG: A diffusion-based approach for real-time speech-driven holistic 3D expression and gesture generation

J Chen, Y Liu, J Wang, A Zeng, Y Li… - Proceedings of the …, 2024 - openaccess.thecvf.com
We propose DiffSHEG, a Diffusion-based approach for Speech-driven Holistic 3D
Expression and Gesture generation. While previous works focused on co-speech gesture or …

Emotional speech-driven animation with content-emotion disentanglement

R Daněček, K Chhatre, S Tripathi, Y Wen… - SIGGRAPH Asia 2023 …, 2023 - dl.acm.org
To be widely adopted, 3D facial avatars must be animated easily, realistically, and directly
from speech signals. While the best recent methods generate 3D animations that are …

FaceTalk: Audio-driven motion diffusion for neural parametric head models

S Aneja, J Thies, A Dai… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
We introduce FaceTalk, a novel generative approach designed for synthesizing high-fidelity
3D motion sequences of talking human heads from an input audio signal. To capture the …

ToonTalker: Cross-domain face reenactment

Y Gong, Y Zhang, X Cun, F Yin, Y Fan… - Proceedings of the …, 2023 - openaccess.thecvf.com
We target cross-domain face reenactment in this paper, i.e., driving a cartoon image with the
video of a real person and vice versa. Recently, many works have focused on one-shot …