LayoutGPT: Compositional visual planning and generation with large language models
Attaining a high degree of user controllability in visual generation often requires intricate,
fine-grained inputs like layouts. However, such inputs impose a substantial burden on users …
BoxDiff: Text-to-image synthesis with training-free box-constrained diffusion
Recent text-to-image diffusion models have demonstrated an astonishing capacity to
generate high-quality images. However, researchers mainly studied the way of synthesizing …
TokenFlow: Consistent diffusion features for consistent video editing
The generative AI revolution has recently expanded to videos. Nevertheless, current state-of-
the-art video models are still lagging behind image models in terms of visual quality and …
Expressive text-to-image generation with rich text
Plain text has become a prevalent interface for text-to-image synthesis. However, its limited
customization options hinder users from accurately describing desired outputs. For example …
Grounded text-to-image synthesis with attention refocusing
Driven by scalable diffusion models trained on large-scale datasets, text-to-image
synthesis methods have shown compelling results. However, these models still fail to …
Mix-of-Show: Decentralized low-rank adaptation for multi-concept customization of diffusion models
Public large-scale text-to-image diffusion models, such as Stable Diffusion, have gained
significant attention from the community. These models can be easily customized for new …
Space-time diffusion features for zero-shot text-driven motion transfer
We present a new method for text-driven motion transfer: synthesizing a video that complies
with an input text prompt describing the target objects and scene while maintaining an input …
Zero-shot spatial layout conditioning for text-to-image diffusion models
Large-scale text-to-image diffusion models have significantly improved the state of the art in
generative image modeling and allow for an intuitive and powerful user interface to drive the …
Compositional text-to-image synthesis with attention map control of diffusion models
Recent text-to-image (T2I) diffusion models show outstanding performance in generating
high-quality images conditioned on textual prompts. However, they fail to semantically align …
Unveiling and mitigating memorization in text-to-image diffusion models through cross attention
Recent advancements in text-to-image (T2I) diffusion models have demonstrated their
remarkable capability to generate high-quality images from textual prompts. However …