Diffusion model is an effective planner and data synthesizer for multi-task reinforcement learning
Diffusion models have demonstrated highly-expressive generative capabilities in vision and
NLP. Recent studies in reinforcement learning (RL) have shown that diffusion models are …
Leveraging imitation learning in agricultural robotics: a comprehensive survey and comparative analysis
Imitation learning (IL), a burgeoning frontier in machine learning, holds immense promise
across diverse domains. In recent years, its integration into robotics has sparked significant …
CEIL: Generalized contextual imitation learning
In this paper, we present ContExtual Imitation Learning (CEIL), a general and broadly
applicable algorithm for imitation learning (IL). Inspired by the formulation of hindsight …
Discriminator-guided model-based offline imitation learning
Offline imitation learning (IL) is a powerful method to solve decision-making problems from
expert demonstrations without reward labels. Existing offline IL methods suffer from severe …
Offline multi-task transfer RL with representational penalization
We study the problem of representation transfer in offline Reinforcement Learning (RL),
where a learner has access to episodic data from a number of source tasks collected a …
Comparing model-free and model-based algorithms for offline reinforcement learning
Offline reinforcement learning (RL) algorithms are often designed with environments such
as MuJoCo in mind, in which the planning horizon is extremely long and no noise exists. We …
Adaptive pessimism via target Q-value for offline reinforcement learning
Offline reinforcement learning (RL) methods learn from datasets without further environment
interaction, facing errors due to out-of-distribution (OOD) actions. Although effective methods …
User-interactive offline reinforcement learning
Offline reinforcement learning algorithms still lack trust in practice due to the risk that the
learned policy performs worse than the original policy that generated the dataset or behaves …
Offline imitation learning with model-based reverse augmentation
In offline Imitation Learning (IL), one of the main challenges is the covariate shift between
the expert observations and the actual distribution encountered by the agent, because it is …
Offline reinforcement learning in high-dimensional stochastic environments
Offline reinforcement learning (RL) has emerged as a promising paradigm for real-world
applications since it aims to train policies directly from datasets of past interactions with the …