Exploration in deep reinforcement learning: A survey

P Ladosz, L Weng, M Kim, H Oh - Information Fusion, 2022 - Elsevier
This paper reviews exploration techniques in deep reinforcement learning. Exploration
techniques are of primary importance when solving sparse reward problems. In sparse …

Towards continual reinforcement learning: A review and perspectives

K Khetarpal, M Riemer, I Rish, D Precup - Journal of Artificial Intelligence …, 2022 - jair.org
In this article, we aim to provide a literature review of different formulations and approaches
to continual reinforcement learning (RL), also known as lifelong or non-stationary RL. We …

Ensemble-SINDy: Robust sparse model discovery in the low-data, high-noise limit, with active learning and control

U Fasel, JN Kutz, BW Brunton… - Proceedings of the …, 2022 - royalsocietypublishing.org
Sparse model identification enables the discovery of nonlinear dynamical systems purely
from data; however, this approach is sensitive to noise, especially in the low-data limit. In this …

Planning to explore via self-supervised world models

R Sekar, O Rybkin, K Daniilidis… - International …, 2020 - proceedings.mlr.press
Reinforcement learning allows solving complex tasks; however, the learning tends to be task-
specific and sample efficiency remains a challenge. We present Plan2Explore, a self …

Model-based reinforcement learning: A survey

TM Moerland, J Broekens, A Plaat… - … and Trends® in …, 2023 - nowpublishers.com
Sequential decision making, commonly formalized as Markov Decision Process (MDP)
optimization, is an important challenge in artificial intelligence. Two key approaches to this …

Self-supervised exploration via disagreement

D Pathak, D Gandhi, A Gupta - International conference on …, 2019 - proceedings.mlr.press
Efficient exploration is a long-standing problem in sensorimotor learning. Major advances
have been demonstrated in noise-free, non-stochastic domains such as video games and …

Discovering and achieving goals via world models

R Mendonca, O Rybkin, K Daniilidis… - Advances in …, 2021 - proceedings.neurips.cc
How can artificial agents learn to solve many diverse tasks in complex visual environments
without any supervision? We decompose this question into two challenges: discovering new …

Reward model ensembles help mitigate overoptimization

T Coste, U Anwar, R Kirk, D Krueger - arXiv preprint arXiv:2310.02743, 2023 - arxiv.org
Reinforcement learning from human feedback (RLHF) is a standard approach for fine-tuning
large language models to follow instructions. As part of this process, learned reward models …

Semantic exploration from language abstractions and pretrained representations

A Tam, N Rabinowitz, A Lampinen… - Advances in neural …, 2022 - proceedings.neurips.cc
Effective exploration is a challenge in reinforcement learning (RL). Novelty-based
exploration methods can suffer in high-dimensional state spaces, such as continuous …

A survey on intrinsic motivation in reinforcement learning

A Aubret, L Matignon, S Hassas - arXiv preprint arXiv:1908.06976, 2019 - arxiv.org
The reinforcement learning (RL) research area is very active, with a large number of
new contributions, especially considering the emergent field of deep RL (DRL). However, a …