A review on deep learning techniques for video prediction
S Oprea, P Martinez-Gonzalez… - … on Pattern Analysis …, 2020 - ieeexplore.ieee.org
The ability to predict, anticipate and reason about future outcomes is a key component of
intelligent decision-making systems. In light of the success of deep learning in computer …
InterDiff: Generating 3D human-object interactions with physics-informed diffusion
This paper addresses a novel task of anticipating 3D human-object interactions (HOIs). Most
existing research on HOI synthesis lacks comprehensive whole-body interactions with …
Disentangling physical dynamics from unknown factors for unsupervised video prediction
Leveraging physical knowledge described by partial differential equations (PDEs) is an
appealing way to improve unsupervised video forecasting models. Since physics is too …
SINC: Spatial composition of 3D human motions for simultaneous action generation
N Athanasiou, M Petrovich… - Proceedings of the …, 2023 - openaccess.thecvf.com
Our goal is to synthesize 3D human motions given textual inputs describing simultaneous
actions, for example 'waving hand' while 'walking' at the same time. We refer to generating such …
Learning multi-object dynamics with compositional neural radiance fields
We present a method to learn compositional multi-object dynamics models from image
observations based on implicit object encoders, Neural Radiance Fields (NeRFs), and …
Joint hand motion and interaction hotspots prediction from egocentric videos
We propose to forecast future hand-object interactions given an egocentric video. Instead of
predicting action labels or pixels, we directly predict the hand motion trajectory and the …
SlotFormer: Unsupervised visual dynamics simulation with object-centric models
Understanding dynamics from visual observations is a challenging problem that requires
disentangling individual objects from the scene and learning their interactions. While recent …
InfiniteNature-Zero: Learning perpetual view generation of natural scenes from single images
We present a method for learning to generate unbounded flythrough videos of natural
scenes starting from a single view. This capability is learned from a collection of single …
Greedy hierarchical variational autoencoders for large-scale video prediction
B Wu, S Nair, R Martin-Martin… - Proceedings of the …, 2021 - openaccess.thecvf.com
A video prediction model that generalizes to diverse scenes would enable intelligent agents
such as robots to perform a variety of tasks via planning with the model. However, while …
Video prediction recalling long-term motion context via memory alignment learning
Our work addresses long-term motion context issues for predicting future frames. To predict
the future precisely, it is required to capture which long-term motion context (e.g., walking or …