Look beneath the surface: Exploiting fundamental symmetry for sample-efficient offline RL

P Cheng, X Zhan, W Zhang, Y Lin… - Advances in Neural …, 2024 - proceedings.neurips.cc
Offline reinforcement learning (RL) offers an appealing approach to real-world tasks by
learning policies from pre-collected datasets without interacting with the environment …

Exploration and anti-exploration with distributional random network distillation

K Yang, J Tao, J Lyu, X Li - arXiv preprint arXiv:2401.09750, 2024 - arxiv.org
Exploration remains a critical issue in deep reinforcement learning for an agent to attain high
returns in unknown environments. Although the prevailing exploration Random Network …

Possibilities of reinforcement learning for nuclear power plants: Evidence on current applications and beyond

A Gong, Y Chen, J Zhang, X Li - Nuclear Engineering and Technology, 2024 - Elsevier
Nuclear energy plays a crucial role in energy supply in the 21st century, and more and more
Nuclear Power Plants (NPPs) will be in operation to contribute to the development of human …

Balancing therapeutic effect and safety in ventilator parameter recommendation: An offline reinforcement learning approach

B Zhang, X Qiu, X Tan - Engineering Applications of Artificial Intelligence, 2024 - Elsevier
Reinforcement learning (RL) is increasingly applied in recommending ventilator parameters,
yet existing methods prioritize therapeutic effect over patient safety. This leads to excessive …

A reliable representation with bidirectional transition model for visual reinforcement learning generalization

X Hu, Y Lin, Y Liu, J Wang, S Wang, H Fan… - arXiv preprint arXiv …, 2023 - arxiv.org
Visual reinforcement learning has proven effective in solving control tasks with high-
dimensional observations. However, extracting reliable and generalizable representations …

Uncertainty-driven trajectory truncation for data augmentation in offline reinforcement learning

J Zhang, J Lyu, X Ma, J Yan, J Yang, L Wan, X Li - ECAI 2023, 2023 - ebooks.iospress.nl
Equipped with the trained environmental dynamics, model-based offline reinforcement
learning (RL) algorithms can often successfully learn good policies from fixed-sized …

CROP: Conservative Reward for Model-based Offline Policy Optimization

H Li, XH Zhou, XL Xie, SQ Liu, ZQ Feng, XY Liu… - arXiv preprint arXiv …, 2023 - arxiv.org
Offline reinforcement learning (RL) aims to optimize policy using collected data without
online interactions. Model-based approaches are particularly appealing for addressing …

Optimistic Model Rollouts for Pessimistic Offline Policy Optimization

Y Zhai, Y Li, Z Gao, X Gong, K Xu, D Feng… - Proceedings of the …, 2024 - ojs.aaai.org
Model-based offline reinforcement learning (RL) has made remarkable progress, offering a
promising avenue for improving generalization with synthetic model rollouts. Existing works …

SEABO: A Simple Search-Based Method for Offline Imitation Learning

J Lyu, X Ma, L Wan, R Liu, X Li, Z Lu - arXiv preprint arXiv:2402.03807, 2024 - arxiv.org
Offline reinforcement learning (RL) has attracted much attention due to its ability to learn
from static offline datasets, eliminating the need to interact with the environment …

Implicit policy constraint for offline reinforcement learning

Z Peng, Y Liu, C Han, Z Zhou - CAAI Transactions on …, 2024 - Wiley Online Library
Offline reinforcement learning (RL) aims to learn policies entirely from passively collected
datasets, making it a data-driven decision method. One of the main challenges in offline RL …