Conservative data sharing for multi-task offline reinforcement learning

L Shi, G Li, Y Wei, Y Chen… - … conference on machine …, 2022 - proceedings.mlr.press

Offline or batch reinforcement learning seeks to learn a near-optimal policy using history
data without active exploration of the environment. To counter the insufficient coverage and …

Cited by 94 - Related articles - All 10 versions

Diffusion model is an effective planner and data synthesizer for multi-task reinforcement learning

H He, C Bai, K Xu, Z Yang, W Zhang… - Advances in neural …, 2024 - proceedings.neurips.cc

Diffusion models have demonstrated highly-expressive generative capabilities in vision and
NLP. Recent studies in reinforcement learning (RL) have shown that diffusion models are …

Cited by 32 - Related articles - All 5 versions

CALVIN: A benchmark for language-conditioned policy learning for long-horizon robot manipulation tasks

O Mees, L Hermann, E Rosete-Beas… - IEEE Robotics and …, 2022 - ieeexplore.ieee.org

General-purpose robots coexisting with humans in their environment must learn to relate
human language to their perceptions and actions to be useful in a range of daily tasks …

Cited by 128 - Related articles - All 5 versions

How to leverage unlabeled data in offline reinforcement learning

T Yu, A Kumar, Y Chebotar… - International …, 2022 - proceedings.mlr.press

Offline reinforcement learning (RL) can learn control policies from static datasets but, like
standard RL methods, it requires reward annotations for every transition. In many cases …

Cited by 59 - Related articles - All 5 versions

Pre-training for robots: Offline RL enables learning new tasks from a handful of trials

A Kumar, A Singh, F Ebert, M Nakamoto… - arXiv preprint arXiv …, 2022 - arxiv.org

Progress in deep learning highlights the tremendous potential of utilizing diverse robotic
datasets for attaining effective generalization and makes it enticing to consider leveraging …

Cited by 45 - Related articles - All 3 versions

Hierarchical diffusion for offline decision making

W Li, X Wang, B Jin, H Zha - International Conference on …, 2023 - proceedings.mlr.press

Offline reinforcement learning typically introduces a hierarchical structure to solve the long-
horizon problem so as to address its thorny issue of variance accumulation. Problems of …

Cited by 16 - Related articles - All 5 versions

Beyond uniform sampling: Offline reinforcement learning with imbalanced datasets

ZW Hong, A Kumar, S Karnik… - Advances in …, 2023 - proceedings.neurips.cc

Offline reinforcement learning (RL) enables learning a decision-making policy without
interaction with the environment. This makes it particularly beneficial in situations where …

Cited by 6 - Related articles - All 6 versions

Don't start from scratch: Leveraging prior data to automate robotic reinforcement learning

HR Walke, JH Yang, A Yu, A Kumar… - … on Robot Learning, 2023 - proceedings.mlr.press

Reinforcement learning (RL) algorithms hold the promise of enabling autonomous skill
acquisition for robotic systems. However, in practice, real-world robotic RL typically requires …

Cited by 29 - Related articles - All 5 versions

The efficacy of pessimism in asynchronous Q-learning

Y Yan, G Li, Y Chen, J Fan - IEEE Transactions on Information …, 2023 - ieeexplore.ieee.org

This paper is concerned with the asynchronous form of Q-learning, which applies a
stochastic approximation scheme to Markovian data samples. Motivated by the recent …

Cited by 48 - Related articles - All 8 versions

Future-conditioned unsupervised pretraining for decision transformer

Z Xie, Z Lin, D Ye, Q Fu, Y Wei… - … Conference on Machine …, 2023 - proceedings.mlr.press

Recent research in offline reinforcement learning (RL) has demonstrated that return-
conditioned supervised learning is a powerful paradigm for decision-making problems …

Cited by 14 - Related articles - All 8 versions