VideoReTalking: Audio-based lip synchronization for talking head video editing in the wild

SadTalker: Learning realistic 3D motion coefficients for stylized audio-driven single image talking face animation

W Zhang, X Cun, X Wang, Y Zhang… - Proceedings of the …, 2023 - openaccess.thecvf.com

Generating talking head videos through a face image and a piece of speech audio still
contains many challenges, i.e., unnatural head movement, distorted expression, and identity …

Cited by 220 · Related articles · All 7 versions

DPE: Disentanglement of pose and expression for general video portrait editing

Y Pang, Y Zhang, W Quan, Y Fan… - Proceedings of the …, 2023 - openaccess.thecvf.com

One-shot video-driven talking face generation aims at producing a synthetic talking video by
transferring the facial motion from a video to an arbitrary portrait image. Head pose and …

Cited by 46 · Related articles · All 6 versions

SyncTalk: The devil is in the synchronization for talking head synthesis

Z Peng, W Hu, Y Shi, X Zhu, X Zhang… - Proceedings of the …, 2024 - openaccess.thecvf.com

Achieving high synchronization in the synthesis of realistic speech-driven talking head
videos presents a significant challenge. Traditional Generative Adversarial Networks (GAN) …

Cited by 28 · Related articles · All 3 versions

Video and audio deepfake datasets and open issues in deepfake technology: being ahead of the curve

Z Akhtar, TL Pendyala, VS Athmakuri - Forensic Sciences, 2024 - mdpi.com

The revolutionary breakthroughs in Machine Learning (ML) and Artificial Intelligence (AI) are
extensively being harnessed across a diverse range of domains, e.g., forensic science …

Cited by 5 · Related articles

Emotional speech-driven animation with content-emotion disentanglement

R Daněček, K Chhatre, S Tripathi, Y Wen… - SIGGRAPH Asia 2023 …, 2023 - dl.acm.org

To be widely adopted, 3D facial avatars must be animated easily, realistically, and directly
from speech signals. While the best recent methods generate 3D animations that are …

Cited by 45 · Related articles · All 4 versions

NOFA: NeRF-based one-shot facial avatar reconstruction

W Yu, Y Fan, Y Zhang, X Wang, F Yin, Y Bai… - ACM SIGGRAPH 2023 …, 2023 - dl.acm.org

3D facial avatar reconstruction has been a significant research topic in computer graphics
and computer vision, where photo-realistic rendering and flexible controls over poses and …

Cited by 27 · Related articles · All 3 versions

The making of an AI news anchor—and its implications

M Bohacek, H Farid - … of the National Academy of Sciences, 2024 - National Acad Sciences

This summer saw a months-long strike that pitted writers and performers against major
Hollywood studios. A particularly fraught point of contention centered around the use (or not) …

Cited by 9 · Related articles · All 7 versions

VASA-1: Lifelike audio-driven talking faces generated in real time

S Xu, G Chen, YX Guo, J Yang, C Li, Z Zang… - arXiv preprint arXiv …, 2024 - arxiv.org

We introduce VASA, a framework for generating lifelike talking faces with appealing visual
affective skills (VAS) given a single static image and a speech audio clip. Our premiere …

Cited by 50 · Related articles · All 2 versions

Learning to generate conditional tri-plane for 3D-aware expression controllable portrait animation

T Ki, D Min, G Chae - European Conference on Computer Vision, 2025 - Springer

In this paper, we present Export3D, a one-shot 3D-aware portrait animation method
that is able to control the facial expression and camera view of a given portrait image. To …

Cited by 3 · Related articles · All 2 versions

Context-aware talking-head video editing

S Yang, W Wang, J Ling, B Peng, X Tan… - Proceedings of the 31st …, 2023 - dl.acm.org

Talking-head video editing aims to efficiently insert, delete, and substitute the word of a
pre-recorded video through a text transcript editor. The key challenge for this task is obtaining an …

Cited by 8 · Related articles · All 3 versions