Inductive biases for deep learning of higher-level cognition

A Goyal, Y Bengio - Proceedings of the Royal Society A, 2022 - royalsocietypublishing.org
A fascinating hypothesis is that human and animal intelligence could be explained by a few
principles (rather than an encyclopaedic list of heuristics). If that hypothesis was correct, we …

PredRNN: A recurrent neural network for spatiotemporal predictive learning

Y Wang, H Wu, J Zhang, Z Gao, J Wang… - … on Pattern Analysis …, 2022 - ieeexplore.ieee.org
The predictive learning of spatiotemporal sequences aims to generate future images by
learning from the historical context, where the visual dynamics are believed to have modular …

Neural production systems

A Goyal, A Didolkar… - Advances in …, 2021 - proceedings.neurips.cc
Visual environments are structured, consisting of distinct objects or entities. These entities
have properties (visible or latent) that determine the manner in which they interact with one …

SIMONe: View-invariant, temporally-abstracted object representations via unsupervised video decomposition

R Kabra, D Zoran, G Erdogan… - Advances in …, 2021 - proceedings.neurips.cc
To help agents reason about scenes in terms of their building blocks, we wish to extract the
compositional structure of any given scene (in particular, the configuration and …

PARTS: Unsupervised segmentation with slots, attention and independence maximization

D Zoran, R Kabra, A Lerchner… - Proceedings of the …, 2021 - openaccess.thecvf.com
From an early age, humans perceive the visual world as composed of coherent objects with
distinctive properties such as shape, size, and color. There is great interest in building …

Iso-Dream: Isolating and leveraging noncontrollable visual dynamics in world models

M Pan, X Zhu, Y Wang, X Yang - Advances in neural …, 2022 - proceedings.neurips.cc
World models learn the consequences of actions in vision-based interactive systems.
However, in practical scenarios such as autonomous driving, there commonly exists …

Guess what moves: Unsupervised video and image segmentation by anticipating motion

S Choudhury, L Karazija, I Laina, A Vedaldi… - arXiv preprint arXiv …, 2022 - arxiv.org
Motion, measured via optical flow, provides a powerful cue to discover and learn objects in
images and videos. However, compared to using appearance, it has some blind spots, such …

Unsupervised multi-object segmentation by predicting probable motion patterns

L Karazija, S Choudhury, I Laina… - Advances in …, 2022 - proceedings.neurips.cc
We propose a new approach to learn to segment multiple image objects without manual
supervision. The method can extract objects from still images, but uses videos for …

Intrinsic physical concepts discovery with object-centric predictive models

Q Tang, X Zhu, Z Lei, Z Zhang - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
The ability to discover abstract physical concepts and understand how they work in the world
through observing lies at the core of human intelligence. The acquisition of this ability is …

Compositional scene representation learning via reconstruction: A survey

J Yuan, T Chen, B Li, X Xue - IEEE Transactions on Pattern …, 2023 - ieeexplore.ieee.org
Visual scenes are composed of visual concepts and have the property of combinatorial
explosion. An important reason for humans to efficiently learn from diverse visual scenes is …