Iterative Regularized Policy Optimization with Imperfect Demonstrations
G Xudong, F Dawei, K Xu, Y Zhai, C Yao… - … on Machine Learning, 2024 - openreview.net
Imitation learning heavily relies on the quality of provided demonstrations. In scenarios
where demonstrations are imperfect and rare, a prevalent approach for refining policies is …
OLLIE: Imitation Learning from Offline Pretraining to Online Finetuning
In this paper, we study offline-to-online Imitation Learning (IL) that pretrains an imitation
policy from static demonstration data, followed by fast finetuning with minimal environmental …
Advantage-Aware Policy Optimization for Offline Reinforcement Learning
Offline Reinforcement Learning (RL) endeavors to leverage offline datasets to craft effective
agent policy without online interaction, which imposes proper conservative constraints with …
Energy-Guided Diffusion Sampling for Offline-to-Online Reinforcement Learning
Combining offline and online reinforcement learning (RL) techniques is indeed crucial for
achieving efficient and safe learning where data acquisition is expensive. Existing methods …
Data-Driven Reinforcement Learning for Optimal Motor Control in Washing Machines
C Kang, G Bae, D Kim, K Lee, D Son, C Lee, J Lee… - ieeecai.org
In this paper, we address the challenge of developing advanced motor control systems for
modern washing machines, which are required to operate under various conditions …
Poisoning Offline Reinforcement Learning to Promote Distributional Shift in Online Finetuning
Z Yu, S Kang, X Zhang - cs.uic.edu
Offline-to-online reinforcement learning has recently been shown effective in reducing the
online sample complexity by first training from offline collected data. However, this additional …