GlyphControl: Glyph Conditional Control for Visual Text Generation

P Cao, F Zhou, Q Song, L Yang - arXiv preprint arXiv:2403.04279, 2024 - arxiv.org

In the rapidly advancing realm of visual generation, diffusion models have revolutionized the
landscape, marking a significant shift in capabilities with their impressive text-guided …

Cited by 8 · Related articles · All 2 versions

Towards diverse and consistent typography generation

W Shimoda, D Haraguchi, S Uchida… - Proceedings of the …, 2024 - openaccess.thecvf.com

In this work, we consider the typography generation task that aims at producing diverse
typographic styling for the given graphic document. We formulate typography generation as …

Cited by 4 · Related articles · All 6 versions

LLMs Meet Multimodal Generation and Editing: A Survey

Y He, Z Liu, J Chen, Z Tian, H Liu, X Chi, R Liu… - arXiv preprint arXiv …, 2024 - arxiv.org

With the recent advancement in large language models (LLMs), there is a growing interest in
combining LLMs with multimodal learning. Previous surveys of multimodal large language …

Cited by 3 · Related articles · All 2 versions

Visual Text Meets Low-level Vision: A Comprehensive Survey on Visual Text Processing

Y Shu, W Zeng, Z Li, F Zhao, Y Zhou - arXiv preprint arXiv:2402.03082, 2024 - arxiv.org

Visual text, a pivotal element in both document and scene images, speaks volumes and
attracts significant attention in the computer vision domain. Beyond visual text detection and …

Related articles · All 2 versions

Refining Text-to-Image Generation: Towards Accurate Training-Free Glyph-Enhanced Image Generation

S Lakhanpal, S Chopra, V Jain, A Chadha… - arXiv preprint arXiv …, 2024 - arxiv.org

Over the past few years, Text-to-Image (T2I) generation approaches based on diffusion
models have gained significant attention. However, vanilla diffusion models often suffer from …

Cited by 1 · Related articles · All 3 versions

ARTIST: Improving the Generation of Text-rich Images by Disentanglement

J Zhang, Y Zhou, J Gu, C Wigington, T Yu… - arXiv preprint arXiv …, 2024 - arxiv.org

Diffusion models have demonstrated exceptional capabilities in generating a broad
spectrum of visual content, yet their proficiency in rendering text is still limited: they often …

Text-Animator: Controllable Visual Text Video Generation

L Liu, Q Liu, S Qian, Y Zhou, W Zhou, H Li, L Xie… - arXiv preprint arXiv …, 2024 - arxiv.org

Video generation is a challenging yet pivotal task in various industries, such as gaming,
e-commerce, and advertising. One significant unresolved aspect within T2V is the effective …

Related articles · All 2 versions

Prompt-Consistency Image Generation (PCIG): A Unified Framework Integrating LLMs, Knowledge Graphs, and Controllable Diffusion Models

Y Sun, Z Chu, Z Qin, K Ren - arXiv preprint arXiv:2406.16333, 2024 - arxiv.org

The rapid advancement of Text-to-Image (T2I) generative models has enabled the synthesis
of high-quality images guided by textual descriptions. Despite this significant progress, these …

Related articles · All 2 versions

Kinetic Typography Diffusion Model

S Park, I Bae, S Shin, HG Jeon - arXiv preprint arXiv:2407.10476, 2024 - arxiv.org

This paper introduces a method for realistic kinetic typography that generates user-preferred
animatable 'text content'. We draw on recent advances in guided video diffusion models to …

Related articles · All 2 versions

GlyphDraw2: Automatic Generation of Complex Glyph Posters with Diffusion Models and Large Language Models

J Ma, Y Deng, C Chen, H Lu, Z Yang - arXiv preprint arXiv:2407.02252, 2024 - arxiv.org

Posters play a crucial role in marketing and advertising, contributing significantly to industrial
design by enhancing visual communication and brand visibility. With recent advances in …

Related articles · All 2 versions