Application of a 3D Talking Head as Part of Telecommunication AR, VR, MR System: Systematic Review
In today's digital era, the realms of virtual reality (VR), augmented reality (AR), and mixed
reality (MR), collectively referred to as extended reality (XR), are reshaping human–computer …
Probabilistic Speech-Driven 3D Facial Motion Synthesis: New Benchmarks, Methods, and Applications
We consider the task of animating 3D facial geometry from speech signal. Existing works are
primarily deterministic, focusing on learning a one-to-one mapping from speech signal to 3D …
Increasing Importance of Joint Analysis of Audio and Video in Computer Vision: A Survey
A Shahabaz, S Sarkar - IEEE Access, 2024 - ieeexplore.ieee.org
The joint analysis of audio and video is a powerful tool that can be applied to various
contexts, including action, speech, and sound recognition, audio-visual video parsing …
VQ-NeRF: Vector Quantization Enhances Implicit Neural Representations
Recent advancements in implicit neural representations have contributed to high-fidelity
surface reconstruction and photorealistic novel view synthesis. However, the computational …
Pose-aware 3D talking face synthesis using geometry-guided audio-vertices attention
Most of the existing 3D talking face synthesis methods suffer from the lack of detailed facial
expressions and realistic head poses, resulting in unsatisfactory experiences for users. In …
Understanding Deep Face Representation via Attribute Recovery
Deep neural networks have proven to be highly effective in the face recognition task, as they
can map raw samples into a discriminative high-dimensional representation space …
Style-Preserving Lip Sync via Audio-Aware Style Reference
Audio-driven lip sync has recently drawn significant attention due to its widespread
application in the multimedia domain. Individuals exhibit distinct lip shapes when speaking …
High-fidelity and Lip-synced Talking Face Synthesis via Landmark-based Diffusion Model
Audio-driven talking face video generation has attracted increasing attention due to its huge
industrial potential. Some previous methods focus on learning a direct mapping from audio …
AVI-Talking: Learning Audio-Visual Instructions for Expressive 3D Talking Face Generation
While considerable progress has been made in achieving accurate lip synchronization for
3D speech-driven talking face generation, the task of incorporating expressive facial detail …
MIMAFace: Face Animation via Motion-Identity Modulated Appearance Feature Learning
Current diffusion-based face animation methods generally adopt a ReferenceNet (a copy of
U-Net) and a large amount of curated self-acquired data to learn appearance features, as …