Gligen: Open-set grounded text-to-image generation
Large-scale text-to-image diffusion models have made amazing advances. However, the
status quo is to use text input alone, which can impede controllability. In this work, we …
Latte: Latent diffusion transformer for video generation
We propose a novel Latent Diffusion Transformer, namely Latte, for video generation. Latte
first extracts spatio-temporal tokens from input videos and then adopts a series of …
Enhancing detail preservation for customized text-to-image generation: A regularization-free approach
Recent text-to-image generation models have demonstrated impressive capability of
generating text-aligned images with high fidelity. However, generating images of novel …
Parrot: Pareto-optimal multi-reward reinforcement learning framework for text-to-image generation
Recent works have demonstrated that using reinforcement learning (RL) with multiple
quality rewards can improve the quality of generated images in text-to-image (T2I) …
Renaissance: A survey into AI text-to-image generation in the era of large model
Text-to-image generation (TTI) refers to the usage of models that could process text input
and generate high fidelity images based on text descriptions. Text-to-image generation …
Exploiting the signal-leak bias in diffusion models
MN Everaert, A Fitsios, M Bocchio… - Proceedings of the …, 2024 - openaccess.thecvf.com
There is a bias in the inference pipeline of most diffusion models. This bias arises from a
signal leak whose distribution deviates from the noise distribution, creating a discrepancy …
Conceptlab: Creative generation using diffusion prior constraints
Recent text-to-image generative models have enabled us to transform our words into
vibrant, captivating imagery. The surge of personalization techniques that has followed has …
SkipDiff: Adaptive Skip Diffusion Model for High-Fidelity Perceptual Image Super-resolution
It is well-known that image quality assessment usually meets with the problem of perception-
distortion (pd) tradeoff. The existing deep image super-resolution (SR) methods either focus …
Texsliders: Diffusion-based texture editing in clip space
Generative models have enabled intuitive image creation and manipulation using natural
language. In particular, diffusion models have recently shown remarkable results for natural …
A survey of diffusion based image generation models: Issues and their solutions
T Zhang, Z Wang, J Huang, MM Tasnim… - arXiv preprint arXiv …, 2023 - arxiv.org
Recently, there has been significant progress in the development of large models. Following
the success of ChatGPT, numerous language models have been introduced, demonstrating …