Deep learning for visual speech analysis: A survey
Visual speech, referring to the visual domain of speech, has attracted increasing attention
due to its wide applications, such as public security, medical treatment, military defense, and …
EMO: Emote Portrait Alive Generating Expressive Portrait Videos with Audio2Video Diffusion Model Under Weak Conditions
In this work, we tackle the challenge of enhancing the realism and expressiveness in talking
head video generation by focusing on the dynamic and nuanced relationship between audio …
Photomaker: Customizing realistic human photos via stacked id embedding
Recent advances in text-to-image generation have made remarkable progress in
synthesizing realistic human photos conditioned on given text prompts. However, existing …
Deepfakes as a threat to a speaker and facial recognition: An overview of tools and attack vectors
Deepfakes present an emerging threat in cyberspace. Recent developments in machine
learning make deepfakes highly believable, and very difficult to differentiate between what is …
Livelyspeaker: Towards semantic-aware co-speech gesture generation
Gestures are non-verbal but important behaviors accompanying people's speech. While
previous methods are able to generate speech rhythm-synchronized gestures, the semantic …
MOFA-Video: Controllable image animation via generative motion field adaptions in frozen image-to-video diffusion model
We present MOFA-Video, an advanced controllable image animation method that generates
video from the given image using various additional controllable signals (such as human …
Portraitbooth: A versatile portrait model for fast identity-preserved personalization
Recent advancements in personalized image generation using diffusion models have been
noteworthy. However, existing methods suffer from inefficiencies due to the requirement for …
Follow-Your-Emoji: Fine-controllable and expressive freestyle portrait animation
We present Follow-Your-Emoji, a diffusion-based framework for portrait animation, which
animates a reference portrait with target landmark sequences. The main challenge of portrait …
Dreamtalk: When expressive talking head generation meets diffusion probabilistic models
Diffusion models have shown remarkable success in a variety of downstream generative
tasks, yet remain under-explored in the important and challenging expressive talking head …
Diffposetalk: Speech-driven stylistic 3d facial animation and head pose generation via diffusion models
The generation of stylistic 3D facial animations driven by speech presents a significant
challenge as it requires learning a many-to-many mapping between speech, style, and the …