SadTalker: Learning realistic 3D motion coefficients for stylized audio-driven single image talking face animation

W Zhang, X Cun, X Wang, Y Zhang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Generating talking head videos from a face image and a piece of speech audio still
contains many challenges, i.e., unnatural head movement, distorted expression, and identity …

CodeTalker: Speech-driven 3D facial animation with discrete motion prior

J Xing, M Xia, Y Zhang, X Cun… - Proceedings of the …, 2023 - openaccess.thecvf.com
Speech-driven 3D facial animation has been widely studied, yet there is still a gap in
achieving realism and vividness due to the highly ill-posed nature and scarcity of audio …

MetaPortrait: Identity-preserving talking head generation with fast personalized adaptation

B Zhang, C Qi, P Zhang, B Zhang… - Proceedings of the …, 2023 - openaccess.thecvf.com
In this work, we propose an ID-preserving talking head generation framework, which
advances previous methods in two aspects. First, as opposed to interpolating from sparse …

EDTalk: Efficient disentanglement for emotional talking head synthesis

S Tan, B Ji, M Bi, Y Pan - European Conference on Computer Vision, 2025 - Springer
Achieving disentangled control over multiple facial motions and accommodating diverse
input modalities greatly enhances the application and entertainment of the talking head …

Deepfake generation and detection: A benchmark and survey

G Pei, J Zhang, M Hu, Z Zhang, C Wang, Y Wu… - arXiv preprint arXiv …, 2024 - arxiv.org
Deepfake is a technology dedicated to creating highly realistic facial images and videos
under specific conditions, which has significant application potential in fields such as …

AIGC for various data modalities: A survey

LG Foo, H Rahmani, J Liu - arXiv preprint arXiv:2308.14177, 2023 - arxiv.org
AI-generated content (AIGC) methods aim to produce text, images, videos, 3D assets, and
other media using AI algorithms. Due to its wide range of applications and the demonstrated …

Hallo: Hierarchical audio-driven visual synthesis for portrait image animation

M Xu, H Li, Q Su, H Shang, L Zhang, C Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
The field of portrait image animation, driven by speech audio input, has experienced
significant advancements in the generation of realistic and dynamic portraits. This research …

FaceChain-ImagineID: Freely Crafting High-Fidelity Diverse Talking Faces from Disentangled Audio

C Xu, Y Liu, J Xing, W Wang, M Sun… - Proceedings of the …, 2024 - openaccess.thecvf.com
In this paper, we abstract the process of people hearing speech, extracting meaningful cues,
and creating various dynamically audio-consistent talking faces, termed Listening and …

AniTalker: Animate vivid and diverse talking faces through identity-decoupled facial motion encoding

T Liu, F Chen, S Fan, C Du, Q Chen, X Chen… - Proceedings of the 32nd …, 2024 - dl.acm.org
The paper introduces AniTalker, an innovative framework designed to generate lifelike
talking faces from a single portrait. Unlike existing models that primarily focus on verbal cues …
