Multimodal image synthesis and editing: A survey and taxonomy
As information exists in various modalities in the real world, effective interaction and fusion
among multimodal information play a key role in the creation and perception of multimodal …
Deepfakes generation and detection: State-of-the-art, open challenges, countermeasures, and way forward
Easy access to audio-visual content on social media, combined with the availability of
modern tools such as TensorFlow or Keras, and open-source trained models, along with …
SadTalker: Learning realistic 3D motion coefficients for stylized audio-driven single image talking face animation
Generating talking head videos through a face image and a piece of speech audio still
contains many challenges, i.e., unnatural head movement, distorted expression, and identity …
Expressive talking head generation with granular audio-visual control
Generating expressive talking heads is essential for creating virtual humans. However,
existing one- or few-shot methods focus on lip-sync and head motion, ignoring the emotional …
StyleHEAT: One-shot high-resolution editable talking face generation via pre-trained StyleGAN
One-shot talking face generation aims at synthesizing a high-quality talking face video from
an arbitrary portrait image, driven by a video or an audio segment. In this work, we provide a …
EAMM: One-shot emotional talking face via audio-based emotion-aware motion model
Although significant progress has been made in audio-driven talking face generation,
existing methods either neglect facial emotion or cannot be applied to arbitrary subjects. In …
Learning dynamic facial radiance fields for few-shot talking head synthesis
Talking head synthesis is an emerging technology with wide applications in film dubbing,
virtual avatars and online education. Recent NeRF-based methods generate more natural …
Identity-preserving talking face generation with landmark and appearance priors
Generating talking face videos from audio attracts lots of research interest. A few person-
specific methods can generate vivid videos but require the target speaker's videos for …
Progressive disentangled representation learning for fine-grained controllable talking head synthesis
We present a novel one-shot talking head synthesis method that achieves disentangled and
fine-grained control over lip motion, eye gaze & blink, head pose, and emotional expression …
Efficient region-aware neural radiance fields for high-fidelity talking portrait synthesis
This paper presents ER-NeRF, a novel conditional Neural Radiance Fields (NeRF) based
architecture for talking portrait synthesis that can concurrently achieve fast convergence, real …