VOCE: Variational optimization with conservative estimation for offline safe reinforcement learning

J Guan, G Chen, J Ji, L Yang… - Advances in Neural …, 2024 - proceedings.neurips.cc
Offline safe reinforcement learning (RL) algorithms promise to learn policies that satisfy
safety constraints directly from offline datasets, without interacting with the environment. This …

APIGen: Automated pipeline for generating verifiable and diverse function-calling datasets

Z Liu, T Hoang, J Zhang, M Zhu, T Lan… - arXiv preprint arXiv …, 2024 - arxiv.org
The advancement of function-calling agent models requires diverse, reliable, and
high-quality datasets. This paper presents APIGen, an automated data generation pipeline …

Constraint-conditioned policy optimization for versatile safe reinforcement learning

Y Yao, Z Liu, Z Cen, J Zhu, W Yu… - Advances in Neural …, 2024 - proceedings.neurips.cc
Safe reinforcement learning (RL) focuses on training reward-maximizing agents subject to
pre-defined safety constraints. Yet, learning versatile safe policies that can adapt to varying …
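
The "versatile" policies this entry points at are usually obtained by conditioning the policy on the constraint itself. As a minimal sketch of that general idea, assuming the simplest conditioning scheme, a scalar cost budget concatenated to the observation (the class name, dimensions, and budget value below are hypothetical, not this paper's architecture):

  # Sketch: a policy network conditioned on a cost threshold, so one
  # model can act under different safety budgets without retraining.
  import torch
  import torch.nn as nn

  class ConstraintConditionedPolicy(nn.Module):
      def __init__(self, obs_dim: int, act_dim: int, hidden: int = 64):
          super().__init__()
          # +1 input for the scalar cost budget kappa
          self.net = nn.Sequential(
              nn.Linear(obs_dim + 1, hidden), nn.Tanh(),
              nn.Linear(hidden, act_dim),
          )

      def forward(self, obs: torch.Tensor, kappa: torch.Tensor) -> torch.Tensor:
          # Concatenate the observation with the target cost budget, so the
          # same network can be queried under different constraint levels.
          return self.net(torch.cat([obs, kappa], dim=-1))

  policy = ConstraintConditionedPolicy(obs_dim=8, act_dim=2)
  obs = torch.randn(1, 8)
  kappa = torch.tensor([[25.0]])  # illustrative cost budget
  action = policy(obs, kappa)

At deployment, changing kappa changes the safety budget the same network is asked to respect, which is what makes a single policy adaptable to varying constraint thresholds.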

POCE: Primal Policy Optimization with Conservative Estimation for Multi-constraint Offline Reinforcement Learning

J Guan, L Shen, A Zhou, L Li, H Hu… - Proceedings of the …, 2024 - openaccess.thecvf.com
Multi-constraint offline reinforcement learning (RL) promises to learn policies that satisfy
both cumulative and state-wise costs from offline datasets. This arrangement provides an …
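
For reference, the two constraint types this abstract distinguishes are conventionally written as follows (standard notation, not quoted from the paper): a cumulative constraint bounds the expected discounted episode cost, while a state-wise constraint must hold at every step.

  \[
    \underbrace{\mathbb{E}_{\pi}\Big[\textstyle\sum_{t=0}^{T}\gamma^{t}\,c(s_t,a_t)\Big]\le\kappa}_{\text{cumulative}}
    \qquad\text{vs.}\qquad
    \underbrace{c(s_t,a_t)\le\kappa_s\ \ \forall t}_{\text{state-wise}}
  \]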

Survival instinct in offline reinforcement learning

A Li, D Misra, A Kolobov… - Advances in neural …, 2024 - proceedings.neurips.cc
We present a novel observation about the behavior of offline reinforcement learning (RL)
algorithms: on many benchmark datasets, offline RL can produce well-performing and safe …

Safe offline reinforcement learning with feasibility-guided diffusion model

Y Zheng, J Li, D Yu, Y Yang, SE Li, X Zhan… - arXiv preprint arXiv …, 2024 - arxiv.org
Safe offline RL is a promising route to safe policy learning that bypasses risky online
interaction. Most existing methods only enforce soft constraints, i.e., constraining safety …

A Survey of Constraint Formulations in Safe Reinforcement Learning

A Wachi, X Shen, Y Sui - arXiv preprint arXiv:2402.02025, 2024 - arxiv.org
Ensuring safety is critical when applying reinforcement learning (RL) to real-world problems.
Consequently, safe RL has emerged as a fundamental and powerful paradigm for safely …

A primal-dual-critic algorithm for offline constrained reinforcement learning

K Hong, Y Li, A Tewari - International Conference on …, 2024 - proceedings.mlr.press
Offline constrained reinforcement learning (RL) aims to learn a policy that maximizes the
expected cumulative reward subject to constraints on expected cumulative cost using an …
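
The objective stated here is the standard constrained MDP program, and primal-dual methods typically optimize its Lagrangian saddle point by alternating updates on the policy and the multiplier. In standard notation (the generic formulation, not necessarily this paper's exact variant):

  \[
    \max_{\pi}\ \min_{\lambda\ge 0}\;
    J_r(\pi)\;-\;\lambda\big(J_c(\pi)-\kappa\big),
    \qquad
    J_r(\pi)=\mathbb{E}_{\pi}\Big[\textstyle\sum_{t}\gamma^{t} r_t\Big],\quad
    J_c(\pi)=\mathbb{E}_{\pi}\Big[\textstyle\sum_{t}\gamma^{t} c_t\Big]
  \]

Here \(\kappa\) is the cost budget; when the constraint is violated (\(J_c(\pi)>\kappa\)), the dual ascent step grows \(\lambda\), which penalizes cost more heavily in the primal policy update.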

Safety-aware causal representation for trustworthy offline reinforcement learning in autonomous driving

H Lin, W Ding, Z Liu, Y Niu, J Zhu… - IEEE Robotics and …, 2024 - ieeexplore.ieee.org
In the domain of autonomous driving, offline reinforcement learning (RL) approaches
exhibit notable efficacy in addressing sequential decision-making problems from offline …

OASIS: Conditional distribution shaping for offline safe reinforcement learning

Y Yao, Z Cen, W Ding, H Lin, S Liu, T Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
Offline safe reinforcement learning (RL) aims to train a policy that satisfies constraints using
a pre-collected dataset. Most current methods struggle with the mismatch between imperfect …