SadTalker: Learning realistic 3D motion coefficients for stylized audio-driven single image talking face animation

W Zhang, X Cun, X Wang, Y Zhang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Generating talking head videos through a face image and a piece of speech audio still
contains many challenges, i.e., unnatural head movement, distorted expression, and identity …

DPE: Disentanglement of pose and expression for general video portrait editing

Y Pang, Y Zhang, W Quan, Y Fan… - Proceedings of the …, 2023 - openaccess.thecvf.com
One-shot video-driven talking face generation aims at producing a synthetic talking video by
transferring the facial motion from a video to an arbitrary portrait image. Head pose and …

SyncTalk: The devil is in the synchronization for talking head synthesis

Z Peng, W Hu, Y Shi, X Zhu, X Zhang… - Proceedings of the …, 2024 - openaccess.thecvf.com
Achieving high synchronization in the synthesis of realistic speech-driven talking head
videos presents a significant challenge. Traditional Generative Adversarial Networks (GAN) …

Video and audio deepfake datasets and open issues in deepfake technology: being ahead of the curve

Z Akhtar, TL Pendyala, VS Athmakuri - Forensic Sciences, 2024 - mdpi.com
The revolutionary breakthroughs in Machine Learning (ML) and Artificial Intelligence (AI) are
extensively being harnessed across a diverse range of domains, e.g., forensic science …

Emotional speech-driven animation with content-emotion disentanglement

R Daněček, K Chhatre, S Tripathi, Y Wen… - SIGGRAPH Asia 2023 …, 2023 - dl.acm.org
To be widely adopted, 3D facial avatars must be animated easily, realistically, and directly
from speech signals. While the best recent methods generate 3D animations that are …

NOFA: NeRF-based one-shot facial avatar reconstruction

W Yu, Y Fan, Y Zhang, X Wang, F Yin, Y Bai… - ACM SIGGRAPH 2023 …, 2023 - dl.acm.org
3D facial avatar reconstruction has been a significant research topic in computer graphics
and computer vision, where photo-realistic rendering and flexible controls over poses and …

The making of an AI news anchor—and its implications

M Bohacek, H Farid - … of the National Academy of Sciences, 2024 - National Acad Sciences
This summer saw a months-long strike that pitted writers and performers against major
Hollywood studios. A particularly fraught point of contention centered around the use (or not) …

VASA-1: Lifelike audio-driven talking faces generated in real time

S Xu, G Chen, YX Guo, J Yang, C Li, Z Zang… - arXiv preprint arXiv …, 2024 - arxiv.org
We introduce VASA, a framework for generating lifelike talking faces with appealing visual
affective skills (VAS) given a single static image and a speech audio clip. Our premiere …

Learning to generate conditional tri-plane for 3D-aware expression controllable portrait animation

T Ki, D Min, G Chae - European Conference on Computer Vision, 2025 - Springer
In this paper, we present Export3D, a one-shot 3D-aware portrait animation method
that is able to control the facial expression and camera view of a given portrait image. To …

Context-aware talking-head video editing

S Yang, W Wang, J Ling, B Peng, X Tan… - Proceedings of the 31st …, 2023 - dl.acm.org
Talking-head video editing aims to efficiently insert, delete, and substitute the word of a
pre-recorded video through a text transcript editor. The key challenge for this task is obtaining an …