Object-centric video prediction without annotation

G Le Moing, J Ponce, C Schmid - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

This paper presents WALDO (WArping Layer-Decomposed Objects), a novel approach to
the prediction of future video frames from past ones. Individual images are decomposed into …

Cited by 4 · Related articles · All 6 versions

Learn the Force We Can: Enabling Sparse Motion Control in Multi-Object Video Generation

A Davtyan, P Favaro - Proceedings of the AAAI Conference on Artificial …, 2024 - ojs.aaai.org

We propose a novel unsupervised method to autoregressively generate videos from a single
frame and a sparse motion input. Our trained model can generate unseen realistic object-to …

Cited by 3 · Related articles · All 4 versions

Forecasting Future Videos from Novel Views via Disentangled 3D Scene Representation

S Yarram, J Yuan - arXiv preprint arXiv:2407.21450, 2024 - arxiv.org

Video extrapolation in space and time (VEST) enables viewers to forecast a 3D scene into
the future and view it from novel viewpoints. Recent methods propose to learn an entangled …

Learning Physical Dynamics for Object-centric Visual Prediction

H Xu, T Chen, F Xu - arXiv preprint arXiv:2403.10079, 2024 - arxiv.org

The ability to model the underlying dynamics of visual scenes and reason about the future is
central to human intelligence. Many attempts have been made to empower intelligent …

Learning Perceptual Prediction: Learning From Humans and Reasoning About Objects

K Schmeckpeper - 2023 - search.proquest.com

Reasoning about the results of their actions is a critical skill for embodied agents. In
this thesis, we study how robots can learn to predict the future from visual observations. We …

[PDF] Self-supervised implicit shape reconstruction and pose estimation for predicting the future

D Patino, K Schmeckpeper, H Gupta, G Georgakis… - neural-implicit-workshop.stanford …

We present our method for efficiently learning an implicit neural representation for shape
reconstruction and pose estimation from raw sensor data. In contrast to recent methods, we …