Controllable generation with text-to-image diffusion models: A survey
In the rapidly advancing realm of visual generation, diffusion models have revolutionized the
landscape, marking a significant shift in capabilities with their impressive text-guided …
Towards diverse and consistent typography generation
In this work, we consider the typography generation task that aims at producing diverse
typographic styling for the given graphic document. We formulate typography generation as …
LLMs Meet Multimodal Generation and Editing: A Survey
With the recent advancement in large language models (LLMs), there is a growing interest in
combining LLMs with multimodal learning. Previous surveys of multimodal large language …
Visual Text Meets Low-level Vision: A Comprehensive Survey on Visual Text Processing
Visual text, a pivotal element in both document and scene images, speaks volumes and
attracts significant attention in the computer vision domain. Beyond visual text detection and …
Refining Text-to-Image Generation: Towards Accurate Training-Free Glyph-Enhanced Image Generation
Over the past few years, Text-to-Image (T2I) generation approaches based on diffusion
models have gained significant attention. However, vanilla diffusion models often suffer from …
ARTIST: Improving the Generation of Text-rich Images by Disentanglement
Diffusion models have demonstrated exceptional capabilities in generating a broad
spectrum of visual content, yet their proficiency in rendering text is still limited: they often …
Text-Animator: Controllable Visual Text Video Generation
Video generation is a challenging yet pivotal task in various industries, such as gaming,
e-commerce, and advertising. One significant unresolved aspect within T2V is the effective …
Prompt-Consistency Image Generation (PCIG): A Unified Framework Integrating LLMs, Knowledge Graphs, and Controllable Diffusion Models
The rapid advancement of Text-to-Image (T2I) generative models has enabled the synthesis
of high-quality images guided by textual descriptions. Despite this significant progress, these …
Kinetic Typography Diffusion Model
This paper introduces a method for realistic kinetic typography that generates user-preferred
animatable 'text content'. We draw on recent advances in guided video diffusion models to …
GlyphDraw2: Automatic Generation of Complex Glyph Posters with Diffusion Models and Large Language Models
J Ma, Y Deng, C Chen, H Lu, Z Yang - arXiv preprint arXiv:2407.02252, 2024 - arxiv.org
Posters play a crucial role in marketing and advertising, contributing significantly to industrial
design by enhancing visual communication and brand visibility. With recent advances in …