FouriScale: A Frequency Perspective on Training-Free High-Resolution Image Synthesis

Citing articles (4 results)

HiPrompt: Tuning-free Higher-Resolution Generation with Hierarchical MLLM Prompts

X Liu, Y He, L Guo, X Li, B Jin, P Li, Y Li… - arXiv preprint arXiv …, 2024 - arxiv.org

The potential for higher-resolution image generation using pretrained diffusion models is
immense, yet these models often struggle with issues of object repetition and structural …

Cited by 1 · All 2 versions

Lumina-T2X: Transforming Text into Any Modality, Resolution, and Duration via Flow-based Large Diffusion Transformers

P Gao, L Zhuo, Z Lin, C Liu, J Chen, R Du, E Xie… - arXiv preprint arXiv …, 2024 - arxiv.org

Sora unveils the potential of scaling Diffusion Transformer for generating photorealistic
images and videos at arbitrary resolutions, aspect ratios, and durations, yet it still lacks …

Cited by 5 · All 2 versions

MegaFusion: Extend Diffusion Models towards Higher-resolution Image Generation without Further Tuning

H Wu, S Shen, Q Hu, X Zhang, Y Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org

Diffusion models have emerged as frontrunners in text-to-image generation for their
impressive capabilities. Nonetheless, their fixed image resolution during training often leads …

DiffuseHigh: Training-free Progressive High-Resolution Image Synthesis through Structure Guidance

Y Kim, G Hwang, E Park - arXiv preprint arXiv:2406.18459, 2024 - arxiv.org

Recent surge in large-scale generative models has spurred the development of vast fields in
computer vision. In particular, text-to-image diffusion models have garnered widespread …