Learning skillful medium-range global weather forecasting
Global medium-range weather forecasting is critical to decision-making across many social
and economic domains. Traditional numerical weather prediction uses increased compute …
A general theoretical paradigm to understand learning from human preferences
The prevalent deployment of learning from human preferences through reinforcement
learning (RLHF) relies on two important approximations: the first assumes that pairwise …
Multi-game decision transformers
A longstanding goal of the field of AI is a method for learning a highly capable, generalist
agent from diverse experience. In the subfields of vision and language, this was largely …
Perceiver IO: A general architecture for structured inputs & outputs
A central goal of machine learning is the development of systems that can solve many
problems in as many data domains as possible. Current architectures, however, cannot be …
Perceiver: General perception with iterative attention
Biological systems understand the world by simultaneously processing high-dimensional
inputs from modalities as diverse as vision, audition, touch, proprioception, etc. The …
Training diffusion models with reinforcement learning
Diffusion models are a class of flexible generative models trained with an approximation to
the log-likelihood objective. However, most use cases of diffusion models are not concerned …
From data to functa: Your data point is a function and you can treat it like one
It is common practice in deep learning to represent a measurement of the world on a
discrete grid, e.g. a 2D grid of pixels. However, the underlying signal represented by these …
Dataset distillation with convexified implicit gradients
We propose a new dataset distillation algorithm using reparameterization and
convexification of implicit gradients (RCIG), that substantially improves the state-of-the-art …
VeLO: Training versatile learned optimizers by scaling up
While deep learning models have replaced hand-designed features across many domains,
these models are still trained with hand-designed optimizers. In this work, we leverage the …
Deep reinforcement learning with plasticity injection
A growing body of evidence suggests that neural networks employed in deep reinforcement
learning (RL) gradually lose their plasticity, the ability to learn from new data; however, the …