Learning off-policy with online planning

N Hansen, X Wang, H Su - arXiv preprint arXiv:2203.04955, 2022 - arxiv.org

Data-driven model predictive control has two key advantages over model-free methods: a
potential for improved sample efficiency through model learning, and better performance as …

Cited by 150 · Related articles · All 10 versions

Model-based safe deep reinforcement learning via a constrained proximal policy optimization algorithm

AK Jayant, S Bhatnagar - Advances in Neural Information …, 2022 - proceedings.neurips.cc

During initial iterations of training in most Reinforcement Learning (RL) algorithms, agents
perform a significant number of random exploratory steps. In the real world, this can limit the …

Cited by 30 · Related articles · All 7 versions

OmniSafe: An infrastructure for accelerating safe reinforcement learning research

J Ji, J Zhou, B Zhang, J Dai, X Pan, R Sun… - arXiv preprint arXiv …, 2023 - arxiv.org

AI systems empowered by reinforcement learning (RL) algorithms harbor the immense
potential to catalyze societal advancement, yet their deployment is often impeded by …

Cited by 24 · Related articles · All 4 versions

Bridging Evolutionary Algorithms and Reinforcement Learning: A Comprehensive Survey

P Li, J Hao, H Tang, X Fu, Y Zheng, K Tang - arXiv preprint arXiv …, 2024 - arxiv.org

Evolutionary Reinforcement Learning (ERL), which integrates Evolutionary Algorithms (EAs)
and Reinforcement Learning (RL) for optimization, has demonstrated remarkable …

Cited by 2 · Related articles · All 2 versions

Mastering the unsupervised reinforcement learning benchmark from pixels

S Rajeswar, P Mazzaglia, T Verbelen… - International …, 2023 - proceedings.mlr.press

Controlling artificial agents from visual sensory data is an arduous task. Reinforcement
learning (RL) algorithms can succeed but require large amounts of interactions between the …

Cited by 10 · Related articles · All 9 versions

Safe DreamerV3: Safe reinforcement learning with world models

W Huang, J Ji, B Zhang, C Xia, Y Yang - arXiv preprint arXiv:2307.07176, 2023 - arxiv.org

The widespread application of Reinforcement Learning (RL) in real-world situations is yet to
come to fruition, largely as a result of its failure to satisfy the essential safety demands of …

Cited by 10 · Related articles · All 3 versions

Bridging Evolutionary Algorithms and Reinforcement Learning: A Comprehensive Survey on Hybrid Algorithms

P Li, J Hao, H Tang, X Fu, Y Zhen… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org

Evolutionary Reinforcement Learning (ERL), which integrates Evolutionary Algorithms (EAs)
and Reinforcement Learning (RL) for optimization, has demonstrated remarkable …

Dual RL: Unification and new methods for reinforcement and imitation learning

H Sikchi, Q Zheng, A Zhang, S Niekum - arXiv preprint arXiv:2302.08560, 2023 - arxiv.org

The goal of reinforcement learning (RL) is to find a policy that maximizes the expected
cumulative return. It has been shown that this objective can be represented as an …

Cited by 17 · Related articles · All 5 versions

Reset-free lifelong learning with skill-space planning

K Lu, A Grover, P Abbeel, I Mordatch - arXiv preprint arXiv:2012.03548, 2020 - arxiv.org

The objective of lifelong reinforcement learning (RL) is to optimize agents which can
continuously adapt and interact in changing environments. However, current RL approaches …

Cited by 40 · Related articles · All 4 versions

Predictable MDP abstraction for unsupervised model-based RL

S Park, S Levine - International Conference on Machine …, 2023 - proceedings.mlr.press

A key component of model-based reinforcement learning (RL) is a dynamics model that
predicts the outcomes of actions. Errors in this predictive model can degrade the …

Cited by 7 · Related articles · All 6 versions