A survey on model-based reinforcement learning

FM Luo, T Xu, H Lai, XH Chen, W Zhang… - Science China Information …, 2024 - Springer
Reinforcement learning (RL) interacts with the environment to solve sequential decision-making
problems via a trial-and-error approach. Errors are always undesirable in real-world …

A review of deep reinforcement learning approaches for smart manufacturing in industry 4.0 and 5.0 framework

A del Real Torres, DS Andreiana, Á Ojeda Roldán… - Applied Sciences, 2022 - mdpi.com
In this review, the industry's current issues regarding intelligent manufacture are presented.
This work presents the status and the potential for the I4.0 and I5.0's revolutionary …

Cooperative exploration for multi-agent deep reinforcement learning

IJ Liu, U Jain, RA Yeh… - … conference on machine …, 2021 - proceedings.mlr.press
Exploration is critical for good results in deep reinforcement learning and has attracted much
attention. However, existing multi-agent deep reinforcement learning algorithms still use …

Dropout Q-functions for doubly efficient reinforcement learning

T Hiraoka, T Imagawa, T Hashimoto, T Onishi… - arXiv preprint arXiv …, 2021 - arxiv.org
Randomized ensembled double Q-learning (REDQ) (Chen et al., 2021b) has recently
achieved state-of-the-art sample efficiency on continuous-action reinforcement learning …

Cross-domain policy adaptation via value-guided data filtering

K Xu, C Bai, X Ma, D Wang, B Zhao… - Advances in …, 2023 - proceedings.neurips.cc
Generalizing policies across different domains with dynamics mismatch poses a significant
challenge in reinforcement learning. For example, a robot learns the policy in a simulator …

Live in the moment: Learning dynamics model adapted to evolving policy

X Wang, W Wongkamjan, R Jia… - … on Machine Learning, 2023 - proceedings.mlr.press
Model-based reinforcement learning (RL) often achieves higher sample efficiency
in practice than model-free RL by learning a dynamics model to generate samples for policy …

Weighted model estimation for offline model-based reinforcement learning

T Hishinuma, K Senda - Advances in neural information …, 2021 - proceedings.neurips.cc
This paper discusses model estimation in offline model-based reinforcement learning
(MBRL), which is important for subsequent policy improvement using an estimated model …

COPlanner: Plan to Roll Out Conservatively but to Explore Optimistically for Model-Based RL

X Wang, R Zheng, Y Sun, R Jia, W Wongkamjan… - arXiv preprint arXiv …, 2023 - arxiv.org
Dyna-style model-based reinforcement learning contains two phases: model rollouts to
generate sample for policy learning and real environment exploration using current policy …

Adaptation augmented model-based policy optimization

J Shen, H Lai, M Liu, H Zhao, Y Yu, W Zhang - Journal of Machine …, 2023 - jmlr.org
Compared to model-free reinforcement learning (RL), model-based RL is often more sample
efficient by leveraging a learned dynamics model to help decision making. However, the …

On effective scheduling of model-based reinforcement learning

H Lai, J Shen, W Zhang, Y Huang… - Advances in …, 2021 - proceedings.neurips.cc
Model-based reinforcement learning has attracted wide attention due to its superior
sample efficiency. Despite its impressive success so far, it is still unclear how to …