A survey on video diffusion models
The recent wave of AI-generated content (AIGC) has witnessed substantial success in
computer vision, with the diffusion model playing a crucial role in this achievement. Due to …
Sora: A review on background, technology, limitations, and opportunities of large vision models
Sora is a text-to-video generative AI model, released by OpenAI in February 2024. The
model is trained to generate videos of realistic or imaginative scenes from text instructions …
Multimodal foundation models: From specialists to general-purpose assistants
Neural compression is the application of neural networks and other machine learning
methods to data compression. Recent advances in statistical machine learning have opened …
Panda-70M: Captioning 70M videos with multiple cross-modality teachers
The quality of the data and annotation upper-bounds the quality of a downstream model.
While there exist large text corpora and image-text pairs, high-quality video-text data is much …
VideoPoet: A large language model for zero-shot video generation
We present VideoPoet, a language model capable of synthesizing high-quality video, with
matching audio, from a large variety of conditioning signals. VideoPoet employs a decoder …
Make pixels dance: High-dynamic video generation
Creating high-dynamic videos such as motion-rich actions and sophisticated visual effects
poses a significant challenge in the field of artificial intelligence. Unfortunately, current state …
State of the art on diffusion models for visual computing
The field of visual computing is rapidly advancing due to the emergence of generative
artificial intelligence (AI), which unlocks unprecedented capabilities for the generation …
When does Sora show: The beginning of TAO to imaginative intelligence and scenarios engineering
During our discussion at workshops for writing “What Does ChatGPT Say: The DAO from
Algorithmic Intelligence to Linguistic Intelligence”[1], we had expected the next milestone for …
DreamGaussian4D: Generative 4D Gaussian splatting
Remarkable progress has been made in 4D content generation recently. However, existing
methods suffer from long optimization time, lack of motion controllability, and a low level of …
Driving into the future: Multiview visual forecasting and planning with world model for autonomous driving
In autonomous driving, predicting future events in advance and evaluating the foreseeable
risks empowers autonomous vehicles to plan their actions, enhancing safety and efficiency …