How to fine-tune the model: Unified model shift and model bias policy optimization

H Zhang, H Yu, J Zhao, D Zhang… - Advances in …, 2024 - proceedings.neurips.cc
Designing and deriving effective model-based reinforcement learning (MBRL) algorithms
with a performance improvement guarantee is challenging, mainly attributed to the high …

Seizing serendipity: Exploiting the value of past success in off-policy actor-critic

T Ji, Y Luo, F Sun, X Zhan, J Zhang, H Xu - arXiv preprint arXiv …, 2023 - arxiv.org
Learning high-quality $Q$-value functions plays a key role in the success of many modern
off-policy deep reinforcement learning (RL) algorithms. Previous works primarily focus on …

Dyna-style Model-based reinforcement learning with Model-Free Policy Optimization

K Dong, Y Luo, Y Wang, Y Liu, C Qu, Q Zhang… - Knowledge-Based …, 2024 - Elsevier
Dyna-style Model-based reinforcement learning (MBRL) methods have demonstrated
superior sample efficiency compared to their model-free counterparts, largely attributable to …

Offline-Boosted Actor-Critic: Adaptively Blending Optimal Historical Behaviors in Deep Off-Policy RL

Y Luo, T Ji, F Sun, J Zhang, H Xu, X Zhan - arXiv preprint arXiv …, 2024 - arxiv.org
Off-policy reinforcement learning (RL) has achieved notable success in tackling many
complex real-world tasks, by leveraging previously collected data for policy learning …

Understanding world models through multi-step pruning policy via reinforcement learning

Z He, W Qiu, W Zhao, X Shao, Z Liu - Information Sciences, 2025 - Elsevier
In model-based reinforcement learning, the conventional approach to addressing world
model bias is to use gradient optimization methods. However, using a singular policy from …

Model-Based Reinforcement Learning with Isolated Imaginations

M Pan, X Zhu, Y Zheng, Y Wang… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
World models learn the consequences of actions in vision-based interactive systems.
However, in practical scenarios like autonomous driving, noncontrollable dynamics that are …

Scrutinize What We Ignore: Reining Task Representation Shift In Context-Based Offline Meta Reinforcement Learning

H Zhang, B Zheng, A Guo, T Ji, PA Heng… - arXiv preprint arXiv …, 2024 - arxiv.org
Offline meta reinforcement learning (OMRL) has emerged as a promising approach for
interaction avoidance and strong generalization performance by leveraging pre-collected …

OMPO: A Unified Framework for RL under Policy and Dynamics Shifts

Y Luo, T Ji, F Sun, J Zhang, H Xu, X Zhan - arXiv preprint arXiv …, 2024 - arxiv.org
Training reinforcement learning policies using environment interaction data collected from
varying policies or dynamics presents a fundamental challenge. Existing works often …

Trust the Model Where It Trusts Itself--Model-Based Actor-Critic with Uncertainty-Aware Rollout Adaption

B Frauenknecht, A Eisele, D Subhasish… - arXiv preprint arXiv …, 2024 - arxiv.org
Dyna-style model-based reinforcement learning (MBRL) combines model-free agents with
predictive transition models through model-based rollouts. This combination raises a critical …

A model of how hierarchical representations constructed in the hippocampus are used to navigate through space

E Chalmers, M Bardal, R McDonald… - Adaptive …, 2024 - journals.sagepub.com
Animals can navigate through complex environments with amazing flexibility and efficiency:
they forage over large areas, quickly learning rewarding behavior and changing their plans …