Kubric: A scalable dataset generator

K Greff, F Belletti, L Beyer, C Doersch… - Proceedings of the …, 2022 - openaccess.thecvf.com
Data is the driving force of machine learning, with the amount and quality of training data
often being more important for the performance of a system than architecture and training …

SAVi++: Towards end-to-end object-centric learning from real-world videos

G Elsayed, A Mahendran… - Advances in …, 2022 - proceedings.neurips.cc
The visual world can be parsimoniously characterized in terms of distinct entities with sparse
interactions. Discovering this compositional structure in dynamic visual scenes has proven …

Conditional object-centric learning from video

T Kipf, GF Elsayed, A Mahendran, A Stone… - arXiv preprint arXiv …, 2021 - arxiv.org
Object-centric representations are a promising path toward more systematic generalization
by providing flexible abstractions upon which compositional world models can be built …

Object scene representation transformer

MSM Sajjadi, D Duckworth… - Advances in neural …, 2022 - proceedings.neurips.cc
A compositional understanding of the world in terms of objects and their geometry in 3D
space is considered a cornerstone of human cognition. Facilitating the learning of such a …

Simple unsupervised object-centric learning for complex and naturalistic videos

G Singh, YF Wu, S Ahn - Advances in Neural Information …, 2022 - proceedings.neurips.cc
Unsupervised object-centric learning aims to represent the modular, compositional, and
causal structure of a scene as a set of object representations and thereby promises to …

Illiterate DALL-E learns to compose

G Singh, F Deng, S Ahn - arXiv preprint arXiv:2110.11405, 2021 - arxiv.org
Although DALL-E has shown an impressive ability of composition-based systematic
generalization in image generation, it requires the dataset of text-image pairs and the …

Object discovery and representation networks

OJ Hénaff, S Koppula, E Shelhamer, D Zoran… - European conference on …, 2022 - Springer
The promise of self-supervised learning (SSL) is to leverage large amounts of unlabeled
data to solve complex tasks. While there has been excellent progress with simple, image …

Towards unsupervised object detection from LiDAR point clouds

L Zhang, AJ Yang, Y Xiong, S Casas… - Proceedings of the …, 2023 - openaccess.thecvf.com
In this paper, we study the problem of unsupervised object detection from 3D point clouds in
self-driving scenes. We present a simple yet effective method that exploits (i) point clustering …

SlotFormer: Unsupervised visual dynamics simulation with object-centric models

Z Wu, N Dvornik, K Greff, T Kipf, A Garg - arXiv preprint arXiv:2210.05861, 2022 - arxiv.org
Understanding dynamics from visual observations is a challenging problem that requires
disentangling individual objects from the scene and learning their interactions. While recent …

Decomposing 3D scenes into objects via unsupervised volume segmentation

K Stelzner, K Kersting, AR Kosiorek - arXiv preprint arXiv:2104.01148, 2021 - arxiv.org
We present ObSuRF, a method which turns a single image of a scene into a 3D model
represented as a set of Neural Radiance Fields (NeRFs), with each NeRF corresponding to …