Exploration in deep reinforcement learning: A survey

P Ladosz, L Weng, M Kim, H Oh - Information Fusion, 2022 - Elsevier
This paper reviews exploration techniques in deep reinforcement learning. Exploration
techniques are of primary importance when solving sparse reward problems. In sparse …

Towards continual reinforcement learning: A review and perspectives

K Khetarpal, M Riemer, I Rish, D Precup - Journal of Artificial Intelligence …, 2022 - jair.org
In this article, we aim to provide a literature review of different formulations and approaches
to continual reinforcement learning (RL), also known as lifelong or non-stationary RL. We …

Ensemble-SINDy: Robust sparse model discovery in the low-data, high-noise limit, with active learning and control

U Fasel, JN Kutz, BW Brunton… - Proceedings of the …, 2022 - royalsocietypublishing.org
Sparse model identification enables the discovery of nonlinear dynamical systems purely
from data; however, this approach is sensitive to noise, especially in the low-data limit. In this …

Planning to explore via self-supervised world models

R Sekar, O Rybkin, K Daniilidis… - International …, 2020 - proceedings.mlr.press
Reinforcement learning allows solving complex tasks; however, the learning tends to be task-
specific and sample efficiency remains a challenge. We present Plan2Explore, a self …

Model-based reinforcement learning: A survey

TM Moerland, J Broekens, A Plaat… - … and Trends® in …, 2023 - nowpublishers.com
Sequential decision making, commonly formalized as Markov Decision Process (MDP)
optimization, is an important challenge in artificial intelligence. Two key approaches to this …

Self-supervised exploration via disagreement

D Pathak, D Gandhi, A Gupta - International conference on …, 2019 - proceedings.mlr.press
Efficient exploration is a long-standing problem in sensorimotor learning. Major advances
have been demonstrated in noise-free, non-stochastic domains such as video games and …

Discovering and achieving goals via world models

R Mendonca, O Rybkin, K Daniilidis… - Advances in …, 2021 - proceedings.neurips.cc
How can artificial agents learn to solve many diverse tasks in complex visual environments
without any supervision? We decompose this question into two challenges: discovering new …

Reward model ensembles help mitigate overoptimization

T Coste, U Anwar, R Kirk, D Krueger - arXiv preprint arXiv:2310.02743, 2023 - arxiv.org
Reinforcement learning from human feedback (RLHF) is a standard approach for fine-tuning
large language models to follow instructions. As part of this process, learned reward models …

Semantic exploration from language abstractions and pretrained representations

A Tam, N Rabinowitz, A Lampinen… - Advances in neural …, 2022 - proceedings.neurips.cc
Effective exploration is a challenge in reinforcement learning (RL). Novelty-based
exploration methods can suffer in high-dimensional state spaces, such as continuous …

A survey on intrinsic motivation in reinforcement learning

A Aubret, L Matignon, S Hassas - arXiv preprint arXiv:1908.06976, 2019 - arxiv.org
The reinforcement learning (RL) research area is very active, with a large number of
new contributions, especially considering the emergent field of deep RL (DRL). However, a …