Animate-a-story: Storytelling with retrieval-augmented video generation

Z Xing, Q Feng, H Chen, Q Dai, H Hu, H Xu… - ACM Computing …, 2023 - dl.acm.org

The recent wave of AI-generated content (AIGC) has witnessed substantial success in
computer vision, with the diffusion model playing a crucial role in this achievement. Due to …

Cited by: 50 (3 versions)

Sora: A review on background, technology, limitations, and opportunities of large vision models

Y Liu, K Zhang, Y Li, Z Yan, C Gao, R Chen… - arXiv preprint arXiv …, 2024 - arxiv.org

Sora is a text-to-video generative AI model, released by OpenAI in February 2024. The
model is trained to generate videos of realistic or imaginative scenes from text instructions …

Cited by: 109 (2 versions)

VBench: Comprehensive benchmark suite for video generative models

Z Huang, Y He, J Yu, F Zhang, C Si… - Proceedings of the …, 2024 - openaccess.thecvf.com

Video generation has witnessed significant advancements, yet evaluating these models
remains a challenge. A comprehensive evaluation benchmark for video generation is …

Cited by: 68 (4 versions)

Generative image dynamics

Z Li, R Tucker, N Snavely… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com

We present an approach to modeling an image-space prior on scene motion. Our prior is
learned from a collection of motion trajectories extracted from real video sequences …

Cited by: 37 (9 versions)

Seeing and hearing: Open-domain visual-audio generation with diffusion latent aligners

Y Xing, Y He, Z Tian, X Wang… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com

Video and audio content creation serves as the core technique for the movie industry and
professional users. Recently, existing diffusion-based methods tackle video and audio …

Cited by: 17 (3 versions)

Vlogger: Make your dream a vlog

S Zhuang, K Li, X Chen, Y Wang… - Proceedings of the …, 2024 - openaccess.thecvf.com

In this work, we present Vlogger, a generic AI system for generating a minute-level video blog
(i.e., vlog) of user descriptions. Different from short videos with a few seconds, vlog often …

Cited by: 14 (3 versions)

ScaleCrafter: Tuning-free higher-resolution visual generation with diffusion models

Y He, S Yang, H Chen, X Cun, M Xia… - The Twelfth …, 2023 - openreview.net

In this work, we investigate the capability of generating images from pre-trained diffusion
models at much higher resolutions than the training image sizes. In addition, the generated …

Cited by: 24 (3 versions)

Retrieval-augmented generation for AI-generated content: A survey

P Zhao, H Zhang, Q Yu, Z Wang, Y Geng, F Fu… - arXiv preprint arXiv …, 2024 - arxiv.org

The development of Artificial Intelligence Generated Content (AIGC) has been facilitated by
advancements in model algorithms, scalable foundation model architectures, and the …

Cited by: 72 (4 versions)

VideoDirectorGPT: Consistent multi-scene video generation via LLM-guided planning

H Lin, A Zala, J Cho, M Bansal - arXiv preprint arXiv:2309.15091, 2023 - arxiv.org

Although recent text-to-video (T2V) generation methods have seen significant
advancements, most of these works focus on producing short video clips of a single event …

Cited by: 34 (3 versions)

InstructVideo: instructing video diffusion models with human feedback

H Yuan, S Zhang, X Wang, Y Wei… - Proceedings of the …, 2024 - openaccess.thecvf.com

Diffusion models have emerged as the de facto paradigm for video generation. However,
their reliance on web-scale data of varied quality often yields results that are visually …

Cited by: 10 (4 versions)