Diffusion model is an effective planner and data synthesizer for multi-task reinforcement learning

H He, C Bai, K Xu, Z Yang, W Zhang… - Advances in neural …, 2023 - proceedings.neurips.cc
Diffusion models have demonstrated highly-expressive generative capabilities in vision and
NLP. Recent studies in reinforcement learning (RL) have shown that diffusion models are …

Leveraging imitation learning in agricultural robotics: a comprehensive survey and comparative analysis

S Mahmoudi, A Davar, P Sohrabipour… - Frontiers in Robotics …, 2024 - frontiersin.org
Imitation learning (IL), a burgeoning frontier in machine learning, holds immense promise
across diverse domains. In recent years, its integration into robotics has sparked significant …

CEIL: Generalized contextual imitation learning

J Liu, L He, Y Kang, Z Zhuang… - Advances in Neural …, 2023 - proceedings.neurips.cc
In this paper, we present ContExtual Imitation Learning (CEIL), a general and broadly
applicable algorithm for imitation learning (IL). Inspired by the formulation of hindsight …

Discriminator-guided model-based offline imitation learning

W Zhang, H Xu, H Niu, P Cheng, M Li… - … on Robot Learning, 2023 - proceedings.mlr.press
Offline imitation learning (IL) is a powerful method to solve decision-making problems from
expert demonstrations without reward labels. Existing offline IL methods suffer from severe …

Offline multi-task transfer RL with representational penalization

A Bose, SS Du, M Fazel - arXiv preprint arXiv:2402.12570, 2024 - arxiv.org
We study the problem of representation transfer in offline Reinforcement Learning (RL),
where a learner has access to episodic data from a number of source tasks collected a …

Comparing model-free and model-based algorithms for offline reinforcement learning

P Swazinna, S Udluft, D Hein, T Runkler - IFAC-PapersOnLine, 2022 - Elsevier
Offline reinforcement learning (RL) algorithms are often designed with environments such
as MuJoCo in mind, in which the planning horizon is extremely long and no noise exists. We …

Adaptive pessimism via target Q-value for offline reinforcement learning

J Liu, Y Zhang, C Li, Y Yang, Y Liu, W Ouyang - Neural Networks, 2024 - Elsevier
Offline reinforcement learning (RL) methods learn from datasets without further environment
interaction, facing errors due to out-of-distribution (OOD) actions. Although effective methods …

User-interactive offline reinforcement learning

P Swazinna, S Udluft, T Runkler - arXiv preprint arXiv:2205.10629, 2022 - arxiv.org
Offline reinforcement learning algorithms still lack trust in practice due to the risk that the
learned policy performs worse than the original policy that generated the dataset or behaves …

Offline imitation learning with model-based reverse augmentation

JJ Shao, HS Shi, LZ Guo, YF Li - Proceedings of the 30th ACM SIGKDD …, 2024 - dl.acm.org
In offline Imitation Learning (IL), one of the main challenges is the covariate shift between
the expert observations and the actual distribution encountered by the agent, because it is …

Offline reinforcement learning in high-dimensional stochastic environments

F Hêche, O Barakat, T Desmettre, T Marx… - Neural Computing and …, 2024 - Springer
Offline reinforcement learning (RL) has emerged as a promising paradigm for real-world
applications since it aims to train policies directly from datasets of past interactions with the …