The DeepMind JAX Ecosystem, 2020

R Lam, A Sanchez-Gonzalez, M Willson, P Wirnsberger… - Science, 2023 - science.org

Global medium-range weather forecasting is critical to decision-making across many social
and economic domains. Traditional numerical weather prediction uses increased compute …

Cited by 209 · Related articles · All 4 versions

A general theoretical paradigm to understand learning from human preferences

MG Azar, ZD Guo, B Piot, R Munos… - International …, 2024 - proceedings.mlr.press

The prevalent deployment of learning from human preferences through reinforcement
learning (RLHF) relies on two important approximations: the first assumes that pairwise …

Cited by 128 · Related articles · All 4 versions

Multi-game decision transformers

KH Lee, O Nachum, MS Yang, L Lee… - Advances in …, 2022 - proceedings.neurips.cc

A longstanding goal of the field of AI is a method for learning a highly capable, generalist
agent from diverse experience. In the subfields of vision and language, this was largely …

Cited by 178 · Related articles · All 10 versions

Perceiver IO: A general architecture for structured inputs & outputs

A Jaegle, S Borgeaud, JB Alayrac, C Doersch… - arXiv preprint arXiv …, 2021 - arxiv.org

A central goal of machine learning is the development of systems that can solve many
problems in as many data domains as possible. Current architectures, however, cannot be …

Cited by 495 · Related articles · All 4 versions

Perceiver: General perception with iterative attention

A Jaegle, F Gimeno, A Brock… - International …, 2021 - proceedings.mlr.press

Biological systems understand the world by simultaneously processing high-dimensional
inputs from modalities as diverse as vision, audition, touch, proprioception, etc. The …

Cited by 820 · Related articles · All 7 versions

Training diffusion models with reinforcement learning

K Black, M Janner, Y Du, I Kostrikov… - arXiv preprint arXiv …, 2023 - arxiv.org

Diffusion models are a class of flexible generative models trained with an approximation to
the log-likelihood objective. However, most use cases of diffusion models are not concerned …

Cited by 96 · Related articles · All 6 versions

From data to functa: Your data point is a function and you can treat it like one

E Dupont, H Kim, SM Eslami, D Rezende… - arXiv preprint arXiv …, 2022 - arxiv.org

It is common practice in deep learning to represent a measurement of the world on a
discrete grid, e.g. a 2D grid of pixels. However, the underlying signal represented by these …

Cited by 145 · Related articles · All 5 versions

Dataset distillation with convexified implicit gradients

N Loo, R Hasani, M Lechner… - … Conference on Machine …, 2023 - proceedings.mlr.press

We propose a new dataset distillation algorithm using reparameterization and
convexification of implicit gradients (RCIG), that substantially improves the state-of-the-art …

Cited by 31 · Related articles · All 6 versions

VeLO: Training versatile learned optimizers by scaling up

L Metz, J Harrison, CD Freeman, A Merchant… - arXiv preprint arXiv …, 2022 - arxiv.org

While deep learning models have replaced hand-designed features across many domains,
these models are still trained with hand-designed optimizers. In this work, we leverage the …

Cited by 48 · Related articles · All 2 versions

Deep reinforcement learning with plasticity injection

E Nikishin, J Oh, G Ostrovski, C Lyle… - Advances in …, 2024 - proceedings.neurips.cc

A growing body of evidence suggests that neural networks employed in deep reinforcement
learning (RL) gradually lose their plasticity, the ability to learn from new data; however, the …

Cited by 19 · Related articles · All 6 versions