Learning skillful medium-range global weather forecasting

R Lam, A Sanchez-Gonzalez, M Willson, P Wirnsberger… - Science, 2023 - science.org
Global medium-range weather forecasting is critical to decision-making across many social
and economic domains. Traditional numerical weather prediction uses increased compute …

A general theoretical paradigm to understand learning from human preferences

MG Azar, ZD Guo, B Piot, R Munos… - International …, 2024 - proceedings.mlr.press
The prevalent deployment of learning from human preferences through reinforcement
learning (RLHF) relies on two important approximations: the first assumes that pairwise …

Multi-game decision transformers

KH Lee, O Nachum, MS Yang, L Lee… - Advances in …, 2022 - proceedings.neurips.cc
A longstanding goal of the field of AI is a method for learning a highly capable, generalist
agent from diverse experience. In the subfields of vision and language, this was largely …

Perceiver IO: A general architecture for structured inputs & outputs

A Jaegle, S Borgeaud, JB Alayrac, C Doersch… - arXiv preprint arXiv …, 2021 - arxiv.org
A central goal of machine learning is the development of systems that can solve many
problems in as many data domains as possible. Current architectures, however, cannot be …

Perceiver: General perception with iterative attention

A Jaegle, F Gimeno, A Brock… - International …, 2021 - proceedings.mlr.press
Biological systems understand the world by simultaneously processing high-dimensional
inputs from modalities as diverse as vision, audition, touch, proprioception, etc. The …

Training diffusion models with reinforcement learning

K Black, M Janner, Y Du, I Kostrikov… - arXiv preprint arXiv …, 2023 - arxiv.org
Diffusion models are a class of flexible generative models trained with an approximation to
the log-likelihood objective. However, most use cases of diffusion models are not concerned …

From data to functa: Your data point is a function and you can treat it like one

E Dupont, H Kim, SM Eslami, D Rezende… - arXiv preprint arXiv …, 2022 - arxiv.org
It is common practice in deep learning to represent a measurement of the world on a
discrete grid, e.g. a 2D grid of pixels. However, the underlying signal represented by these …

Dataset distillation with convexified implicit gradients

N Loo, R Hasani, M Lechner… - … Conference on Machine …, 2023 - proceedings.mlr.press
We propose a new dataset distillation algorithm using reparameterization and
convexification of implicit gradients (RCIG), that substantially improves the state-of-the-art …

VeLO: Training versatile learned optimizers by scaling up

L Metz, J Harrison, CD Freeman, A Merchant… - arXiv preprint arXiv …, 2022 - arxiv.org
While deep learning models have replaced hand-designed features across many domains,
these models are still trained with hand-designed optimizers. In this work, we leverage the …

Deep reinforcement learning with plasticity injection

E Nikishin, J Oh, G Ostrovski, C Lyle… - Advances in …, 2024 - proceedings.neurips.cc
A growing body of evidence suggests that neural networks employed in deep reinforcement
learning (RL) gradually lose their plasticity, the ability to learn from new data; however, the …