Controllable generation with text-to-image diffusion models: A survey

P Cao, F Zhou, Q Song, L Yang - arXiv preprint arXiv:2403.04279, 2024 - arxiv.org
In the rapidly advancing realm of visual generation, diffusion models have revolutionized the
landscape, marking a significant shift in capabilities with their impressive text-guided …

Build-A-Scene: Interactive 3D layout control for diffusion-based image generation

A Eldesokey, P Wonka - arXiv preprint arXiv:2408.14819, 2024 - arxiv.org
We propose a diffusion-based approach for Text-to-Image (T2I) generation with interactive
3D layout control. Layout control has been widely studied to alleviate the shortcomings of …

MasterWeaver: Taming Editability and Face Identity for Personalized Text-to-Image Generation

Y Wei, Z Ji, J Bai, H Zhang, L Zhang, W Zuo - European Conference on …, 2025 - Springer
Text-to-image (T2I) diffusion models have shown significant success in
personalized text-to-image generation, which aims to generate novel images with human …

A Survey on Personalized Content Synthesis with Diffusion Models

X Zhang, XY Wei, W Zhang, J Wu, Z Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent advancements in generative models have significantly impacted content creation,
leading to the emergence of Personalized Content Synthesis (PCS). With a small set of user …

VLEU: a Method for Automatic Evaluation for Generalizability of Text-to-Image Models

J Cao, Z Zhang, H Wang, KF Wong - arXiv preprint arXiv:2409.14704, 2024 - arxiv.org
Progress in Text-to-Image (T2I) models has significantly improved the generation of images
from textual descriptions. However, existing evaluation metrics do not adequately assess the …