A review of recurrent neural networks: LSTM cells and network architectures
Y Yu, X Si, C Hu, J Zhang - Neural computation, 2019 - direct.mit.edu
Recurrent neural networks (RNNs) have been widely adopted in research areas concerned
with sequential data, such as text, audio, and video. However, RNNs consisting of sigma …
with sequential data, such as text, audio, and video. However, RNNs consisting of sigma …
Image and video compression with neural networks: A review
In recent years, the image and video coding technologies have advanced by leaps and
bounds. However, due to the popularization of image and video acquisition devices, the …
bounds. However, due to the popularization of image and video acquisition devices, the …
Imagen video: High definition video generation with diffusion models
We present Imagen Video, a text-conditional video generation system based on a cascade
of video diffusion models. Given a text prompt, Imagen Video generates high definition …
of video diffusion models. Given a text prompt, Imagen Video generates high definition …
Preserve your own correlation: A noise prior for video diffusion models
Despite tremendous progress in generating high-quality images using diffusion models,
synthesizing a sequence of animated frames that are both photorealistic and temporally …
synthesizing a sequence of animated frames that are both photorealistic and temporally …
Phenaki: Variable length video generation from open domain textual descriptions
We present Phenaki, a model capable of realistic video synthesis given a sequence of
textual prompts. Generating videos from text is particularly challenging due to the …
textual prompts. Generating videos from text is particularly challenging due to the …
Sequential modeling enables scalable learning for large vision models
We introduce a novel sequential modeling approach which enables learning a Large Vision
Model (LVM) without making use of any linguistic data. To do this we define a common …
Model (LVM) without making use of any linguistic data. To do this we define a common …
Simvp: Simpler yet better video prediction
Abstract From CNN, RNN, to ViT, we have witnessed remarkable advancements in video
prediction, incorporating auxiliary inputs, elaborate neural architectures, and sophisticated …
prediction, incorporating auxiliary inputs, elaborate neural architectures, and sophisticated …
Stylegan-v: A continuous video generator with the price, image quality and perks of stylegan2
I Skorokhodov, S Tulyakov… - Proceedings of the …, 2022 - openaccess.thecvf.com
Videos show continuous events, yet most--if not all--video synthesis frameworks treat them
discretely in time. In this work, we think of videos of what they should be--time-continuous …
discretely in time. In this work, we think of videos of what they should be--time-continuous …
Long video generation with time-agnostic vqgan and time-sensitive transformer
Videos are created to express emotion, exchange information, and share experiences.
Video synthesis has intrigued researchers for a long time. Despite the rapid progress driven …
Video synthesis has intrigued researchers for a long time. Despite the rapid progress driven …
Predrnn: A recurrent neural network for spatiotemporal predictive learning
The predictive learning of spatiotemporal sequences aims to generate future images by
learning from the historical context, where the visual dynamics are believed to have modular …
learning from the historical context, where the visual dynamics are believed to have modular …