SyncTalk: The devil is in the synchronization for talking head synthesis

Z Peng, W Hu, Y Shi, X Zhu, X Zhang… - Proceedings of the …, 2024 - openaccess.thecvf.com
Achieving high synchronization in the synthesis of realistic speech-driven talking head
videos presents a significant challenge. Traditional Generative Adversarial Networks (GAN) …

[BOOK][B] Handbook of virtual humans

N Magnenat-Thalmann, D Thalmann - 2005 - books.google.com
Virtual Humans are becoming more and more popular and used in many applications such
as the entertainment industry (in both film and games) and medical applications. This …

Real-time speech-driven face animation with expressions using neural networks

P Hong, Z Wen, TS Huang - IEEE Transactions on neural …, 2002 - ieeexplore.ieee.org
A real-time speech-driven synthetic talking face provides an effective multimodal
communication interface in distributed collaboration environments. Nonverbal gestures such …

Intelligent virtual humans with autonomy and personality: State-of-the-art

Z Kasap, N Magnenat-Thalmann - Intelligent Decision …, 2007 - content.iospress.com
Intelligent virtual characters have been subject to exponential growth in the last decades and
they are utilized in many application areas such as education, training, human-computer …

Emotional expressions in audiovisual human computer interaction

LS Chen, TS Huang - … Proceedings. Latest Advances in the Fast …, 2000 - ieeexplore.ieee.org
Visual and auditory modalities are two of the most commonly used media in interactions
between humans. The authors describe a system to continuously monitor the user's voice …

Picture my voice: Audio to visual speech synthesis using artificial neural networks

DW Massaro, J Beskow, MM Cohen… - AVSP'99-International …, 1999 - isca-speech.org
This paper presents an initial implementation and evaluation of a system that synthesizes
visual speech directly from the acoustic waveform. An artificial neural network (ANN) was …

Talking heads-Models and applications for multimodal speech synthesis

J Beskow - 2003 - diva-portal.org
This thesis presents work in the area of computer animated talking heads. A system for
multimodal speech synthesis has been developed, capable of generating audiovisual …

System and method for real time lip synchronization

Y Huang, S Ssu-te Lin, B Guo, HY Shum - US Patent 7,133,535, 2006 - Google Patents
This invention is directed toward a system and method for lip synchronization. More
specifically, this invention is directed towards a system and method for generating a …

Audio/visual mapping with cross-modal hidden Markov models

S Fu, R Gutierrez-Osuna, A Esposito… - IEEE Transactions …, 2005 - ieeexplore.ieee.org
The audio/visual mapping problem of speech-driven facial animation has intrigued
researchers for years. Recent research efforts have demonstrated that hidden Markov model …

Style transfer for 2d talking head animation

TT Pham, N Le, T Do, H Nguyen, E Tjiputra… - arXiv preprint arXiv …, 2023 - arxiv.org
Audio-driven talking head animation is a challenging research topic with many real-world
applications. Recent works have focused on creating photo-realistic 2D animation, while …