SVDiff: Compact parameter space for diffusion fine-tuning
Recently, diffusion models have achieved remarkable success in text-to-image generation,
enabling the creation of high-quality images from text prompts and various conditions …
VBench: Comprehensive benchmark suite for video generative models
Video generation has witnessed significant advancements, yet evaluating these models
remains a challenge. A comprehensive evaluation benchmark for video generation is …
FreeU: Free lunch in diffusion U-Net
In this paper, we uncover the untapped potential of the diffusion U-Net, which serves as a "free
lunch" that substantially improves the generation quality on the fly. We initially investigate …
Diffusion hyperfeatures: Searching through time and space for semantic correspondence
Diffusion models have been shown to be capable of generating high-quality images,
suggesting that they could contain meaningful internal representations. Unfortunately, the …
Visual instruction inversion: Image editing via image prompting
Text-conditioned image editing has emerged as a powerful tool for editing images. However,
in many situations, language can be ambiguous and ineffective in describing specific image …
VideoBooth: Diffusion-based video generation with image prompts
Text-driven video generation witnesses rapid progress. However, merely using text prompts
is not enough to depict the desired subject appearance that accurately aligns with users' …
CoDi-2: In-Context Interleaved and Interactive Any-to-Any Generation
We present CoDi-2, a Multimodal Large Language Model (MLLM) for learning in-context
interleaved multimodal representations. By aligning modalities with language for …
Domain-agnostic tuning-encoder for fast personalization of text-to-image models
Text-to-image (T2I) personalization allows users to guide the creative image generation
process by combining their own visual concepts in natural language prompts. Recently …
Concept decomposition for visual exploration and inspiration
A creative idea is often born from transforming, combining, and modifying ideas from existing
visual examples capturing various concepts. However, one cannot simply copy the concept …
It's All About Your Sketch: Democratising Sketch Control in Diffusion Models
This paper unravels the potential of sketches for diffusion models, addressing the deceptive
promise of direct sketch control in generative AI. We importantly democratise the process …