VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models
Text-to-video generation aims to produce a video based on a given prompt. Recently,
several commercial video models have been able to generate plausible videos with minimal …
DynamiCrafter: Animating Open-Domain Images with Video Diffusion Priors
Animating a still image offers an engaging visual experience. Traditional image animation
techniques mainly focus on animating natural scenes with stochastic dynamics (e.g., clouds …
SSR-Encoder: Encoding Selective Subject Representation for Subject-Driven Generation
Recent advancements in subject-driven image generation have led to zero-shot generation,
yet precise selection and focus on crucial subject representations remain challenging …
ProxEdit: Improving Tuning-Free Real Image Editing with Proximal Guidance
DDIM inversion has revealed the remarkable potential of real image editing within diffusion-
based methods. However, the accuracy of DDIM reconstruction degrades as larger classifier …
As-Plausible-As-Possible: Plausibility-Aware Mesh Deformation Using 2D Diffusion Priors
We present the As-Plausible-as-Possible (APAP) mesh deformation technique that
leverages 2D diffusion priors to preserve the plausibility of a mesh under user-controlled …
Prompt Highlighter: Interactive Control for Multi-Modal LLMs
This study targets a critical aspect of multi-modal LLMs' (LLMs & VLMs) inference: explicit,
controllable text generation. Multi-modal LLMs empower multi-modality understanding with …
Open-Vocabulary Attention Maps with Token Optimization for Semantic Segmentation in Diffusion Models
P Marcos-Manchón, R Alcover-Couso… - Proceedings of the …, 2024 - openaccess.thecvf.com
Diffusion models represent a new paradigm in text-to-image generation. Beyond generating
high-quality images from text prompts, models such as Stable Diffusion have been …
Accelerating diffusion models for inverse problems through shortcut sampling
Diffusion models have recently demonstrated an impressive ability to address inverse
problems in an unsupervised manner. While existing methods primarily focus on modifying …
ColorizeDiffusion: Adjustable Sketch Colorization with Reference Image and Text
D Yan, L Yuan, Y Nishioka, I Fujishiro… - arXiv preprint arXiv …, 2024 - arxiv.org
Recently, diffusion models have demonstrated their effectiveness in generating extremely
high-quality images and have found wide-ranging applications, including automatic sketch …
CustomCrafter: Customized Video Generation with Preserving Motion and Concept Composition Abilities
Customized video generation aims to generate high-quality videos guided by text prompts
and subject reference images. However, since the model is only trained on static images, the fine …