Controllable generation with text-to-image diffusion models: A survey

P Cao, F Zhou, Q Song, L Yang - arXiv preprint arXiv:2403.04279, 2024 - arxiv.org
In the rapidly advancing realm of visual generation, diffusion models have revolutionized the
landscape, marking a significant shift in capabilities with their impressive text-guided …

Follow-Your-Emoji: Fine-Controllable and Expressive Freestyle Portrait Animation

Y Ma, H Liu, H Wang, H Pan, Y He, J Yuan… - arXiv preprint arXiv …, 2024 - arxiv.org
We present Follow-Your-Emoji, a diffusion-based framework for portrait animation, which
animates a reference portrait with target landmark sequences. The main challenge of portrait …

A Survey of Multimodal-Guided Image Editing with Text-to-Image Diffusion Models

X Shuai, H Ding, X Ma, R Tu, YG Jiang… - arXiv preprint arXiv …, 2024 - arxiv.org
Image editing aims to edit the given synthetic or real image to meet the specific requirements
from users. It has been widely studied in recent years as a promising and challenging field of …

ID-Animator: Zero-Shot Identity-Preserving Human Video Generation

X He, Q Liu, S Qian, X Wang, T Hu, K Cao… - arXiv preprint arXiv …, 2024 - arxiv.org
Generating high fidelity human video with specified identities has attracted significant
attention in the content generation community. However, existing techniques struggle to …

A Survey on Personalized Content Synthesis with Diffusion Models

X Zhang, XY Wei, W Zhang, J Wu, Z Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent advancements in generative models have significantly impacted content creation,
leading to the emergence of Personalized Content Synthesis (PCS). With a small set of user …

Conditional Video Generation Guided by Multimodal Inputs: A Comprehensive Survey

K Niu, W Liu, N Sharif, D Zhu - 2024 - researchgate.net
The field of video generation is rapidly evolving, driven by advancements in generative
models. This survey provides a comprehensive analysis of the diverse methodologies …