A-star: Test-time attention segregation and retention for text-to-image synthesis

C Li, Z Gan, Z Yang, J Yang, L Li… - … and Trends® in …, 2024 - nowpublishers.com

Neural compression is the application of neural networks and other machine learning
methods to data compression. Recent advances in statistical machine learning have opened …

Cited by 195 Related articles All 6 versions

Grounded text-to-image synthesis with attention refocusing

Q Phung, S Ge, JB Huang - … of the IEEE/CVF Conference on …, 2024 - openaccess.thecvf.com

Driven by the scalable diffusion models trained on large-scale datasets, text-to-image
synthesis methods have shown compelling results. However, these models still fail to …

Cited by 82 Related articles All 3 versions

Focus on your instruction: Fine-grained and multi-instruction image editing by attention modulation

Q Guo, T Lin - Proceedings of the IEEE/CVF Conference on …, 2024 - openaccess.thecvf.com

Recently, diffusion-based methods like InstructPix2Pix (IP2P) have achieved effective
instruction-based image editing, requiring only natural language instructions from the user …

Cited by 17 Related articles All 3 versions

Conform: Contrast is all you need for high-fidelity text-to-image diffusion models

THS Meral, E Simsar, F Tombari… - Proceedings of the …, 2024 - openaccess.thecvf.com

Images produced by text-to-image diffusion models might not always faithfully represent the
semantic intent of the provided text prompt, where the model might overlook or entirely fail to …

Cited by 16 Related articles All 2 versions

Initno: Boosting text-to-image diffusion models via initial noise optimization

X Guo, J Liu, M Cui, J Li, H Yang… - Proceedings of the …, 2024 - openaccess.thecvf.com

Recent strides in the development of diffusion models, exemplified by advancements such as
Stable Diffusion, have underscored their remarkable prowess in generating visually …

Cited by 13 Related articles All 3 versions

Not all noises are created equally: Diffusion noise selection and optimization

Z Qi, L Bai, H Xiong, Z Xie - arXiv preprint arXiv:2407.14041, 2024 - arxiv.org

Diffusion models that can generate high-quality data from randomly sampled Gaussian
noises have become the mainstream generative method in both academia and industry. Are …

Cited by 8 Related articles All 4 versions

Object-conditioned energy-based attention map alignment in text-to-image diffusion models

Y Zhang, P Yu, YN Wu - European Conference on Computer Vision, 2025 - Springer

Text-to-image diffusion models have shown great success in generating high-quality
text-guided images. Yet, these models may still fail to semantically align generated images with …

Cited by 5 Related articles All 2 versions

Improving compositional text-to-image generation with large vision-language models

S Wen, G Fang, R Zhang, P Gao, H Dong… - arXiv preprint arXiv …, 2023 - arxiv.org

Recent advancements in text-to-image models, particularly diffusion models, have shown
significant promise. However, compositional text-to-image models frequently encounter …

Cited by 12 Related articles All 3 versions

MC²: Multi-concept Guidance for Customized Multi-concept Generation

J Jiang, Y Zhang, K Feng, X Wu, W Zuo - arXiv preprint arXiv:2404.05268, 2024 - arxiv.org

Customized text-to-image generation aims to synthesize instantiations of user-specified
concepts and has achieved unprecedented progress in handling individual concept …

Cited by 8 Related articles All 2 versions

Enhancing semantic mapping in text-to-image diffusion via Gather-and-Bind

H Fu, G Cheng - Computers & Graphics, 2024 - Elsevier

Text-to-image synthesis is a challenging task that aims to generate realistic and diverse
images from natural language descriptions. However, existing text-to-image diffusion …

Cited by 1 Related articles