Real-time talking head driven by voice and its application to communication and entertainment

Z Peng, W Hu, Y Shi, X Zhu, X Zhang… - Proceedings of the …, 2024 - openaccess.thecvf.com

Achieving high synchronization in the synthesis of realistic speech-driven talking head
videos presents a significant challenge. Traditional Generative Adversarial Networks (GAN) …

Cited by 14 · Related articles · All 3 versions

[BOOK][B] Handbook of virtual humans

N Magnenat-Thalmann, D Thalmann - 2005 - books.google.com

Virtual Humans are becoming more and more popular and used in many applications such
as the entertainment industry (in both film and games) and medical applications. This …

Cited by 195 · Related articles · All 6 versions

Real-time speech-driven face animation with expressions using neural networks

P Hong, Z Wen, TS Huang - IEEE Transactions on neural …, 2002 - ieeexplore.ieee.org

A real-time speech-driven synthetic talking face provides an effective multimodal
communication interface in distributed collaboration environments. Nonverbal gestures such …

Cited by 137 · Related articles · All 9 versions

[PDF] researchgate.net

Intelligent virtual humans with autonomy and personality: State-of-the-art

Z Kasap, N Magnenat-Thalmann - Intelligent Decision …, 2007 - content.iospress.com

Intelligent virtual characters have been subject to exponential growth in the last decades and
they are utilized in many application areas such as education, training, human-computer …

Cited by 105 · Related articles · All 14 versions

Emotional expressions in audiovisual human computer interaction

LS Chen, TS Huang - … Proceedings. Latest Advances in the Fast …, 2000 - ieeexplore.ieee.org

Visual and auditory modalities are two of the most commonly used media in interactions
between humans. The authors describe a system to continuously monitor the user's voice …

Cited by 119 · Related articles · All 3 versions

[HTML] diva-portal.org

Picture my voice: Audio to visual speech synthesis using artificial neural networks

DW Massaro, J Beskow, MM Cohen… - AVSP'99-International …, 1999 - isca-speech.org

This paper presents an initial implementation and evaluation of a system that synthesizes
visual speech directly from the acoustic waveform. An artificial neural network (ANN) was …

Cited by 126 · Related articles · All 14 versions

[PDF] diva-portal.org

Talking heads-Models and applications for multimodal speech synthesis

J Beskow - 2003 - diva-portal.org

This thesis presents work in the area of computer animated talking heads. A system for
multimodal speech synthesis has been developed, capable of generating audiovisual …

Cited by 131 · Related articles · All 4 versions

[PDF] googleapis.com

System and method for real time lip synchronization

Y Huang, S Ssu-te Lin, B Guo, HY Shum - US Patent 7,133,535, 2006 - Google Patents

This invention is directed toward a system and method for lip Synchronization. More
specifically, this invention is directed towards a system and method for generating a …

Cited by 71 · Related articles · All 4 versions

[PDF] researchgate.net

Audio/visual mapping with cross-modal hidden Markov models

S Fu, R Gutierrez-Osuna, A Esposito… - IEEE Transactions …, 2005 - ieeexplore.ieee.org

The audio/visual mapping problem of speech-driven facial animation has intrigued
researchers for years. Recent research efforts have demonstrated that hidden Markov model …

Cited by 74 · Related articles · All 14 versions

[PDF] arxiv.org

Style transfer for 2d talking head animation

TT Pham, N Le, T Do, H Nguyen, E Tjiputra… - arXiv preprint arXiv …, 2023 - arxiv.org

Audio-driven talking head animation is a challenging research topic with many real-world
applications. Recent works have focused on creating photo-realistic 2D animation, while …

Cited by 3 · Related articles · All 2 versions