Application of a 3D Talking Head as Part of Telecommunication AR, VR, MR System: Systematic Review

N Christoff, NN Neshov, K Tonchev, A Manolova - Electronics, 2023 - mdpi.com
In today's digital era, the realms of virtual reality (VR), augmented reality (AR), and mixed
reality (MR), collectively referred to as extended reality (XR), are reshaping human–computer …

Probabilistic Speech-Driven 3D Facial Motion Synthesis: New Benchmarks, Methods, and Applications

KD Yang, A Ranjan, JHR Chang… - Proceedings of the …, 2024 - openaccess.thecvf.com
We consider the task of animating 3D facial geometry from speech signal. Existing works are
primarily deterministic, focusing on learning a one-to-one mapping from speech signal to 3D …

Increasing Importance of Joint Analysis of Audio and Video in Computer Vision: A Survey

A Shahabaz, S Sarkar - IEEE Access, 2024 - ieeexplore.ieee.org
The joint analysis of audio and video is a powerful tool that can be applied to various
contexts, including action, speech, and sound recognition, audio-visual video parsing …

VQ-NeRF: Vector Quantization Enhances Implicit Neural Representations

Y Yang, W Liu, F Yin, X Chen, G Yu, J Fan… - arXiv preprint arXiv …, 2023 - arxiv.org
Recent advancements in implicit neural representations have contributed to high-fidelity
surface reconstruction and photorealistic novel view synthesis. However, the computational …

Pose-aware 3D talking face synthesis using geometry-guided audio-vertices attention

B Li, X Wei, B Liu, Z He, J Cao… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Most of the existing 3D talking face synthesis methods suffer from the lack of detailed facial
expressions and realistic head poses, resulting in unsatisfactory experiences for users. In …

Understanding Deep Face Representation via Attribute Recovery

M Ren, Y Zhu, Y Wang, Y Huang… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Deep neural networks have proven to be highly effective in the face recognition task, as they
can map raw samples into a discriminative high-dimensional representation space …

Style-Preserving Lip Sync via Audio-Aware Style Reference

W Zhong, J Li, Y Cai, L Lin, G Li - arXiv preprint arXiv:2408.05412, 2024 - arxiv.org
Audio-driven lip sync has recently drawn significant attention due to its widespread
application in the multimedia domain. Individuals exhibit distinct lip shapes when speaking …

High-fidelity and Lip-synced Talking Face Synthesis via Landmark-based Diffusion Model

W Zhong, J Lin, P Chen, L Lin, G Li - arXiv preprint arXiv:2408.05416, 2024 - arxiv.org
Audio-driven talking face video generation has attracted increasing attention due to its huge
industrial potential. Some previous methods focus on learning a direct mapping from audio …

AVI-Talking: Learning Audio-Visual Instructions for Expressive 3D Talking Face Generation

Y Sun, W Chu, H Zhou, K Wang, H Koike - IEEE Access, 2024 - ieeexplore.ieee.org
While considerable progress has been made in achieving accurate lip synchronization for
3D speech-driven talking face generation, the task of incorporating expressive facial detail …

MIMAFace: Face Animation via Motion-Identity Modulated Appearance Feature Learning

Y Han, J Zhu, Y Feng, X Ji, K He, X Li, Y Liu - arXiv preprint arXiv …, 2024 - arxiv.org
Current diffusion-based face animation methods generally adopt a ReferenceNet (a copy of
U-Net) and a large amount of curated self-acquired data to learn appearance features, as …