EvalCrafter: Benchmarking and evaluating large video generation models

Z Xing, Q Feng, H Chen, Q Dai, H Hu, H Xu… - ACM Computing …, 2023 - dl.acm.org

The recent wave of AI-generated content (AIGC) has witnessed substantial success in
computer vision, with the diffusion model playing a crucial role in this achievement. Due to …

Cited by 50 Related articles All 3 versions

VBench: Comprehensive benchmark suite for video generative models

Z Huang, Y He, J Yu, F Zhang, C Si… - Proceedings of the …, 2024 - openaccess.thecvf.com

Video generation has witnessed significant advancements, yet evaluating these models
remains a challenge. A comprehensive evaluation benchmark for video generation is …

Cited by 69 Related articles All 4 versions

VideoPoet: A large language model for zero-shot video generation

D Kondratyuk, L Yu, X Gu, J Lezama, J Huang… - arXiv preprint arXiv …, 2023 - arxiv.org

We present VideoPoet, a language model capable of synthesizing high-quality video, with
matching audio, from a large variety of conditioning signals. VideoPoet employs a decoder …

Cited by 75 Related articles All 5 versions

Fairy: Fast parallelized instruction-guided video-to-video synthesis

B Wu, CY Chuang, X Wang, Y Jia… - Proceedings of the …, 2024 - openaccess.thecvf.com

In this paper, we introduce Fairy, a minimalist yet robust adaptation of image-editing diffusion
models, enhancing them for video editing applications. Our approach centers on the concept …

Cited by 13 Related articles All 3 versions

Is Sora a world simulator? A comprehensive survey on general world models and beyond

Z Zhu, X Wang, W Zhao, C Min, N Deng, M Dou… - arXiv preprint arXiv …, 2024 - arxiv.org

General world models represent a crucial pathway toward achieving Artificial General
Intelligence (AGI), serving as the cornerstone for various applications ranging from virtual …

Cited by 12 Related articles All 3 versions

Evaluating text-to-visual generation with image-to-text generation

Z Lin, D Pathak, B Li, J Li, X Xia, G Neubig… - arXiv preprint arXiv …, 2024 - arxiv.org

Despite significant progress in generative AI, comprehensive evaluation remains
challenging because of the lack of effective metrics and standardized benchmarks. For …

Cited by 21 Related articles All 2 versions

Evaluating and Improving Compositional Text-to-Visual Generation

B Li, Z Lin, D Pathak, J Li, Y Fei, K Wu… - Proceedings of the …, 2024 - openaccess.thecvf.com

While text-to-visual models now produce photo-realistic images and videos, they struggle
with compositional text prompts involving attributes, relationships, and higher-order …

Cited by 1 Related articles

GPT4Motion: Scripting Physical Motions in Text-to-Video Generation via Blender-Oriented GPT Planning

J Lv, Y Huang, M Yan, J Huang, J Liu… - Proceedings of the …, 2024 - openaccess.thecvf.com

Recent advances in text-to-video generation have harnessed the power of diffusion models
to create visually compelling content conditioned on text prompts. However, they usually …

Cited by 6 Related articles All 4 versions

FreeNoise: Tuning-free longer video diffusion via noise rescheduling

H Qiu, M Xia, Y Zhang, Y He, X Wang, Y Shan… - arXiv preprint arXiv …, 2023 - arxiv.org

With the availability of large-scale video datasets and the advances of diffusion models,
text-driven video generation has achieved substantial progress. However, existing video …

Cited by 29 Related articles All 3 versions

AIGC-VQA: A holistic perception metric for AIGC video quality assessment

Y Lu, X Li, B Li, Z Yu, F Guan, X Wang… - Proceedings of the …, 2024 - openaccess.thecvf.com

With the development of generative models such as the diffusion model and auto-regressive
model, AI-generated content (AIGC) is experiencing an explosive growth. Moreover, existing …

Cited by 3 Related articles