LayoutGPT: Compositional visual planning and generation with large language models

W Feng, W Zhu, T Fu, V Jampani… - Advances in …, 2024 - proceedings.neurips.cc
Attaining a high degree of user controllability in visual generation often requires intricate,
fine-grained inputs like layouts. However, such inputs impose a substantial burden on users …

BoxDiff: Text-to-image synthesis with training-free box-constrained diffusion

J Xie, Y Li, Y Huang, H Liu, W Zhang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Recent text-to-image diffusion models have demonstrated an astonishing capacity to
generate high-quality images. However, researchers mainly studied the way of synthesizing …

TokenFlow: Consistent diffusion features for consistent video editing

M Geyer, O Bar-Tal, S Bagon, T Dekel - arXiv preprint arXiv:2307.10373, 2023 - arxiv.org
The generative AI revolution has recently expanded to videos. Nevertheless, current state-of-
the-art video models are still lagging behind image models in terms of visual quality and …

Expressive text-to-image generation with rich text

S Ge, T Park, JY Zhu, JB Huang - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Plain text has become a prevalent interface for text-to-image synthesis. However, its limited
customization options hinder users from accurately describing desired outputs. For example …

Grounded text-to-image synthesis with attention refocusing

Q Phung, S Ge, JB Huang - … of the IEEE/CVF Conference on …, 2024 - openaccess.thecvf.com
Driven by scalable diffusion models trained on large-scale datasets, text-to-image
synthesis methods have shown compelling results. However, these models still fail to …

Mix-of-show: Decentralized low-rank adaptation for multi-concept customization of diffusion models

Y Gu, X Wang, JZ Wu, Y Shi, Y Chen… - Advances in …, 2024 - proceedings.neurips.cc
Public large-scale text-to-image diffusion models, such as Stable Diffusion, have gained
significant attention from the community. These models can be easily customized for new …

Space-time diffusion features for zero-shot text-driven motion transfer

D Yatim, R Fridman, O Bar-Tal… - Proceedings of the …, 2024 - openaccess.thecvf.com
We present a new method for text-driven motion transfer: synthesizing a video that complies
with an input text prompt describing the target objects and scene while maintaining an input …

Zero-shot spatial layout conditioning for text-to-image diffusion models

G Couairon, M Careil, M Cord… - Proceedings of the …, 2023 - openaccess.thecvf.com
Large-scale text-to-image diffusion models have significantly improved the state of the art in
generative image modeling and allow for an intuitive and powerful user interface to drive the …

Compositional text-to-image synthesis with attention map control of diffusion models

R Wang, Z Chen, C Chen, J Ma, H Lu… - Proceedings of the AAAI …, 2024 - ojs.aaai.org
Recent text-to-image (T2I) diffusion models show outstanding performance in generating
high-quality images conditioned on textual prompts. However, they fail to semantically align …

Unveiling and mitigating memorization in text-to-image diffusion models through cross attention

J Ren, Y Li, S Zeng, H Xu, L Lyu, Y Xing… - European Conference on …, 2024 - Springer
Recent advancements in text-to-image (T2I) diffusion models have demonstrated their
remarkable capability to generate high-quality images from textual prompts. However …