Customization assistant for text-to-image generation

J Ma, J Liang, C Chen, H Lu - ACM SIGGRAPH 2024 Conference …, 2024 - dl.acm.org

Recent progress in personalized image generation using diffusion models has been
significant. However, development in the area of open-domain and test-time fine-tuning-free …

Cited by 79 · Related articles · All 3 versions

[PDF] arxiv.org

OMG: Occlusion-friendly personalized multi-concept generation in diffusion models

Z Kong, Y Zhang, T Yang, T Wang, K Zhang… - … on Computer Vision, 2025 - Springer

Personalization is an important topic in text-to-image generation, especially the challenging
multi-concept personalization. Current multi-concept methods are struggling with identity …

Cited by 10 · Related articles · All 2 versions

Glyph-ByT5: A customized text encoder for accurate visual text rendering

Z Liu, W Liang, Z Liang, C Luo, J Li, G Huang… - … on Computer Vision, 2025 - Springer

Visual text rendering poses a fundamental challenge for contemporary text-to-image
generation models, with the core problem lying in text encoder deficiencies. To achieve …

Cited by 8 · Related articles · All 2 versions

[PDF] arxiv.org

Controllable generation with text-to-image diffusion models: A survey

P Cao, F Zhou, Q Song, L Yang - arXiv preprint arXiv:2403.04279, 2024 - arxiv.org

In the rapidly advancing realm of visual generation, diffusion models have revolutionized the
landscape, marking a significant shift in capabilities with their impressive text-guided …

Cited by 19 · Related articles · All 2 versions

[PDF] arxiv.org

LLMs Meet Multimodal Generation and Editing: A Survey

Y He, Z Liu, J Chen, Z Tian, H Liu, X Chi, R Liu… - arXiv preprint arXiv …, 2024 - arxiv.org

With the recent advancement in large language models (LLMs), there is a growing interest in
combining LLMs with multimodal learning. Previous surveys of multimodal large language …

Cited by 9 · Related articles · All 2 versions

[PDF] arxiv.org

A Survey on Personalized Content Synthesis with Diffusion Models

X Zhang, XY Wei, W Zhang, J Wu, Z Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org

Recent advancements in generative models have significantly impacted content creation,
leading to the emergence of Personalized Content Synthesis (PCS). With a small set of user …

Cited by 3 · Related articles · All 2 versions

[PDF] arxiv.org

Novel Object Synthesis via Adaptive Text-Image Harmony

Z Xiong, Z Zhang, Z Chen, S Chen, X Li, G Sun… - arXiv preprint arXiv …, 2024 - arxiv.org

In this paper, we study an object synthesis task that combines an object text with an object
image to create a new object image. However, most diffusion models struggle with this …

Toffee: Efficient Million-Scale Dataset Construction for Subject-Driven Text-to-Image Generation

Y Zhou, R Zhang, K Zheng, N Zhao, J Gu… - arXiv preprint arXiv …, 2024 - arxiv.org

In subject-driven text-to-image generation, recent works have achieved superior
performance by training the model on synthetic datasets containing numerous image pairs …

Cited by 2 · Related articles · All 2 versions

[PDF] arxiv.org

ControlVAR: Exploring Controllable Visual Autoregressive Modeling

X Li, K Qiu, H Chen, J Kuen, Z Lin, R Singh… - arXiv preprint arXiv …, 2024 - arxiv.org

Conditional visual generation has witnessed remarkable progress with the advent of
diffusion models (DMs), especially in tasks like control-to-image generation. However …

Cited by 8 · Related articles

[PDF] researchgate.net

[PDF] Long-Term Ad Memorability: Understanding & Generating Memorable Ads

SI Harini, S Singh, Y Kumar, A Bhattacharyya, V Baths… - researchgate.net

Marketers spend billions of dollars on advertisements, but to what end? At purchase time, if
customers cannot recognize the brand for which they saw an ad, the money spent on the ad …