Inductive biases for deep learning of higher-level cognition
A fascinating hypothesis is that human and animal intelligence could be explained by a few
principles (rather than an encyclopaedic list of heuristics). If that hypothesis was correct, we …
principles (rather than an encyclopaedic list of heuristics). If that hypothesis was correct, we …
Predrnn: A recurrent neural network for spatiotemporal predictive learning
The predictive learning of spatiotemporal sequences aims to generate future images by
learning from the historical context, where the visual dynamics are believed to have modular …
learning from the historical context, where the visual dynamics are believed to have modular …
Neural production systems
AG ALIAS PARTH GOYAL, A Didolkar… - Advances in …, 2021 - proceedings.neurips.cc
Visual environments are structured, consisting of distinct objects or entities. These entities
have properties---visible or latent---that determine the manner in which they interact with one …
have properties---visible or latent---that determine the manner in which they interact with one …
Simone: View-invariant, temporally-abstracted object representations via unsupervised video decomposition
To help agents reason about scenes in terms of their building blocks, we wish to extract the
compositional structure of any given scene (in particular, the configuration and …
compositional structure of any given scene (in particular, the configuration and …
Parts: Unsupervised segmentation with slots, attention and independence maximization
From an early age, humans perceive the visual world as composed of coherent objects with
distinctive properties such as shape, size, and color. There is great interest in building …
distinctive properties such as shape, size, and color. There is great interest in building …
Iso-dream: Isolating and leveraging noncontrollable visual dynamics in world models
World models learn the consequences of actions in vision-based interactive systems.
However, in practical scenarios such as autonomous driving, there commonly exists …
However, in practical scenarios such as autonomous driving, there commonly exists …
Guess what moves: Unsupervised video and image segmentation by anticipating motion
Motion, measured via optical flow, provides a powerful cue to discover and learn objects in
images and videos. However, compared to using appearance, it has some blind spots, such …
images and videos. However, compared to using appearance, it has some blind spots, such …
Unsupervised multi-object segmentation by predicting probable motion patterns
We propose a new approach to learn to segment multiple image objects without manual
supervision. The method can extract objects form still images, but uses videos for …
supervision. The method can extract objects form still images, but uses videos for …
Intrinsic physical concepts discovery with Object-Centric predictive models
The ability to discover abstract physical concepts and understand how they work in the world
through observing lies at the core of human intelligence. The acquisition of this ability is …
through observing lies at the core of human intelligence. The acquisition of this ability is …
Compositional scene representation learning via reconstruction: A survey
Visual scenes are composed of visual concepts and have the property of combinatorial
explosion. An important reason for humans to efficiently learn from diverse visual scenes is …
explosion. An important reason for humans to efficiently learn from diverse visual scenes is …