A dissection of overfitting and generalization in continuous reinforcement learning

Scaling laws for reward model overoptimization

L Gao, J Schulman, J Hilton - International Conference on …, 2023 - proceedings.mlr.press

In reinforcement learning from human feedback, it is common to optimize against a reward
model trained to predict human preferences. Because the reward model is an imperfect …

Cited by 316 · All 7 versions

A survey of zero-shot generalisation in deep reinforcement learning

R Kirk, A Zhang, E Grefenstette, T Rocktäschel - Journal of Artificial …, 2023 - jair.org

The study of zero-shot generalisation (ZSG) in deep Reinforcement Learning (RL) aims to
produce RL algorithms whose policies generalise well to novel unseen situations at …

Cited by 366 · All 9 versions

Deep transfer learning approaches for Monkeypox disease diagnosis

MM Ahsan, MR Uddin, MS Ali, MK Islam… - Expert Systems with …, 2023 - Elsevier

Monkeypox has become a significant global challenge as the number of cases increases
daily. Those infected with the disease often display various skin symptoms and can spread …

Cited by 81 · All 6 versions

MOPO: Model-based offline policy optimization

T Yu, G Thomas, L Yu, S Ermon… - Advances in …, 2020 - proceedings.neurips.cc

Offline reinforcement learning (RL) refers to the problem of learning policies entirely from a
batch of previously collected data. This problem setting is compelling, because it offers the …

Cited by 822 · All 11 versions

Leveraging procedural generation to benchmark reinforcement learning

K Cobbe, C Hesse, J Hilton… - … conference on machine …, 2020 - proceedings.mlr.press

We introduce Procgen Benchmark, a suite of 16 procedurally generated game-like
environments designed to benchmark both sample efficiency and generalization in …

Cited by 592 · All 6 versions

An introduction to deep reinforcement learning

V François-Lavet, P Henderson, R Islam… - … and Trends® in …, 2018 - nowpublishers.com

Deep reinforcement learning is the combination of reinforcement learning (RL) and deep
learning. This field of research has been able to solve a wide range of complex …

Cited by 1860 · All 16 versions

Causal reinforcement learning: A survey

Z Deng, J Jiang, G Long, C Zhang - arXiv preprint arXiv:2307.01452, 2023 - arxiv.org

Reinforcement learning is an essential paradigm for solving sequential decision problems
under uncertainty. Despite many remarkable achievements in recent decades, applying …

Cited by 15 · All 5 versions

Evolving curricula with regret-based environment design

J Parker-Holder, M Jiang, M Dennis… - International …, 2022 - proceedings.mlr.press

Training generally-capable agents with reinforcement learning (RL) remains a significant
challenge. A promising avenue for improving the robustness of RL agents is through the use …

Cited by 113 · All 5 versions

On the measure of intelligence

F Chollet - arXiv preprint arXiv:1911.01547, 2019 - arxiv.org

To make deliberate progress towards more intelligent and more human-like artificial
systems, we need to be following an appropriate feedback signal: we need to be able to …

Cited by 618 · All 5 versions

Contrastive behavioral similarity embeddings for generalization in reinforcement learning

R Agarwal, MC Machado, PS Castro… - arXiv preprint arXiv …, 2021 - arxiv.org

Reinforcement learning methods trained on few environments rarely learn policies that
generalize to unseen environments. To improve generalization, we incorporate the inherent …

Cited by 200 · All 11 versions