Exploration in deep reinforcement learning: A survey

P Ladosz, L Weng, M Kim, H Oh - Information Fusion, 2022 - Elsevier
This paper reviews exploration techniques in deep reinforcement learning. Exploration
techniques are of primary importance when solving sparse reward problems. In sparse …

Hierarchical reinforcement learning: A comprehensive survey

S Pateria, B Subagdja, A Tan, C Quek - ACM Computing Surveys (CSUR …, 2021 - dl.acm.org
Hierarchical Reinforcement Learning (HRL) enables autonomous decomposition of
challenging long-horizon decision-making tasks into simpler subtasks. During the past …

Multi-agent deep reinforcement learning: a survey

S Gronauer, K Diepold - Artificial Intelligence Review, 2022 - Springer
The advances in reinforcement learning have recorded sublime success in various domains.
Although the multi-agent domain has been overshadowed by its single-agent counterpart …

Reinforcement learning based recommender systems: A survey

MM Afsar, T Crump, B Far - ACM Computing Surveys, 2022 - dl.acm.org
Recommender systems (RSs) have become an inseparable part of our everyday lives. They
help us find our favorite items to purchase, our friends on social networks, and our favorite …

A survey of meta-reinforcement learning

J Beck, R Vuorio, EZ Liu, Z Xiong, L Zintgraf… - arXiv preprint arXiv …, 2023 - arxiv.org
While deep reinforcement learning (RL) has fueled multiple high-profile successes in
machine learning, it is held back from more widespread adoption by its often poor data …

Q-learning algorithms: A comprehensive classification and applications

B Jang, M Kim, G Harerimana, JW Kim - IEEE Access, 2019 - ieeexplore.ieee.org
Q-learning is arguably one of the most applied representative reinforcement learning
approaches and one of the off-policy strategies. Since the emergence of Q-learning, many …

Model-based reinforcement learning: A survey

TM Moerland, J Broekens, A Plaat… - … and Trends® in …, 2023 - nowpublishers.com
Sequential decision making, commonly formalized as Markov Decision Process (MDP)
optimization, is an important challenge in artificial intelligence. Two key approaches to this …

Goal-conditioned reinforcement learning with imagined subgoals

E Chane-Sane, C Schmid… - … conference on machine …, 2021 - proceedings.mlr.press
Goal-conditioned reinforcement learning endows an agent with a large variety of skills, but it
often struggles to solve tasks that require more temporally extended reasoning. In this work …

Human-level performance in 3D multiplayer games with population-based reinforcement learning

M Jaderberg, WM Czarnecki, I Dunning, L Marris… - Science, 2019 - science.org
Reinforcement learning (RL) has shown great success in increasingly complex single-agent
environments and two-player turn-based games. However, the real world contains multiple …

Dynamics-aware unsupervised discovery of skills

A Sharma, S Gu, S Levine, V Kumar… - arXiv preprint arXiv …, 2019 - arxiv.org
Conventionally, model-based reinforcement learning (MBRL) aims to learn a global model
for the dynamics of the environment. A good model can potentially enable planning …