The uncertainty Bellman equation and exploration

Exploration in deep reinforcement learning: A survey

P Ladosz, L Weng, M Kim, H Oh - Information Fusion, 2022 - Elsevier

This paper reviews exploration techniques in deep reinforcement learning. Exploration
techniques are of primary importance when solving sparse reward problems. In sparse …

Cited by 352 · Related articles · All 5 versions

A review of uncertainty quantification in deep learning: Techniques, applications and challenges

M Abdar, F Pourpanah, S Hussain, D Rezazadegan… - Information Fusion, 2021 - Elsevier

Uncertainty quantification (UQ) methods play a pivotal role in reducing the impact of
uncertainties during both optimization and decision making processes. They have been …

Cited by 2374 · Related articles · All 12 versions

Offline reinforcement learning: Tutorial, review, and perspectives on open problems

S Levine, A Kumar, G Tucker, J Fu - arXiv preprint arXiv:2005.01643, 2020 - arxiv.org

In this tutorial article, we aim to provide the reader with the conceptual tools needed to get
started on research on offline reinforcement learning algorithms: reinforcement learning …

Cited by 2104 · Related articles · All 3 versions

Off-policy deep reinforcement learning without exploration

S Fujimoto, D Meger, D Precup - … conference on machine …, 2019 - proceedings.mlr.press

Many practical applications of reinforcement learning constrain agents to learn from a fixed
batch of data which has already been gathered, without offering further possibility for data …

Cited by 1765 · Related articles · All 9 versions

Exploration by random network distillation

Y Burda, H Edwards, A Storkey, O Klimov - arXiv preprint arXiv …, 2018 - arxiv.org

We introduce an exploration bonus for deep reinforcement learning methods that is easy to
implement and adds minimal overhead to the computation performed. The bonus is the error …

Cited by 1558 · Related articles · All 10 versions

First return, then explore

A Ecoffet, J Huizinga, J Lehman, KO Stanley, J Clune - Nature, 2021 - nature.com

Reinforcement learning promises to solve complex sequential-decision problems
autonomously by specifying a high-level reward function only. However, reinforcement …

Cited by 414 · Related articles · All 10 versions

Addressing function approximation error in actor-critic methods

S Fujimoto, H Hoof, D Meger - International conference on …, 2018 - proceedings.mlr.press

In value-based reinforcement learning methods such as deep Q-learning, function
approximation errors are known to lead to overestimated value estimates and suboptimal …

Cited by 6518 · Related articles · All 8 versions

Robust reinforcement learning: A review of foundations and recent advances

J Moos, K Hansel, H Abdulsamad, S Stark… - Machine Learning and …, 2022 - mdpi.com

Reinforcement learning (RL) has become a highly successful framework for learning in
Markov decision processes (MDP). Due to the adoption of RL in realistic and complex …

Cited by 131 · Related articles · All 7 versions

Pessimistic bootstrapping for uncertainty-driven offline reinforcement learning

C Bai, L Wang, Z Yang, Z Deng, A Garg, P Liu… - arXiv preprint arXiv …, 2022 - arxiv.org

Offline Reinforcement Learning (RL) aims to learn policies from previously collected
datasets without exploring the environment. Directly applying off-policy algorithms to offline …

Cited by 156 · Related articles · All 5 versions

Go-explore: a new approach for hard-exploration problems

A Ecoffet, J Huizinga, J Lehman, KO Stanley… - arXiv preprint arXiv …, 2019 - arxiv.org

A grand challenge in reinforcement learning is intelligent exploration, especially when
rewards are sparse or deceptive. Two Atari games serve as benchmarks for such hard …

Cited by 460 · Related articles · All 2 versions