End-to-end robotic reinforcement learning without reward engineering

J Ibarz, J Tan, C Finn, M Kalakrishnan… - … Journal of Robotics …, 2021 - journals.sagepub.com

Deep reinforcement learning (RL) has emerged as a promising approach for autonomously
acquiring complex behaviors from low-level sensor observations. Although a large portion of …

Cited by 668 Related articles All 7 versions

Survey on large language model-enhanced reinforcement learning: Concept, taxonomy, and methods

Y Cao, H Zhao, Y Cheng, T Shu, Y Chen… - … on Neural Networks …, 2024 - ieeexplore.ieee.org

With extensive pretrained knowledge and high-level general capabilities, large language
models (LLMs) emerge as a promising avenue to augment reinforcement learning (RL) in …

Cited by 33 Related articles All 2 versions

On the opportunities and risks of foundation models

R Bommasani, DA Hudson, E Adeli, R Altman… - arXiv preprint arXiv …, 2021 - arxiv.org

AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are
trained on broad data at scale and are adaptable to a wide range of downstream tasks. We …

Cited by 4534 Related articles All 2 versions

VIP: Towards universal visual reward and representation via value-implicit pre-training

YJ Ma, S Sodhani, D Jayaraman, O Bastani… - arXiv preprint arXiv …, 2022 - arxiv.org

Reward and representation learning are two long-standing challenges for learning an
expanding set of robot manipulation skills from sensory observations. Given the inherent …

Cited by 241 Related articles All 5 versions

What matters in learning from offline human demonstrations for robot manipulation

A Mandlekar, D Xu, J Wong, S Nasiriany… - arXiv preprint arXiv …, 2021 - arxiv.org

Imitating human demonstrations is a promising approach to endow robots with various
manipulation capabilities. While recent advances have been made in imitation learning and …

Cited by 408 Related articles All 4 versions

Learning language-conditioned robot behavior from offline data and crowd-sourced annotation

S Nair, E Mitchell, K Chen… - Conference on Robot …, 2022 - proceedings.mlr.press

We study the problem of learning a range of vision-based manipulation tasks from a large
offline dataset of robot interaction. In order to accomplish this, humans need easy and …

Cited by 158 Related articles All 5 versions

Language conditioned imitation learning over unstructured data

C Lynch, P Sermanet - arXiv preprint arXiv:2005.07648, 2020 - arxiv.org

Natural language is perhaps the most flexible and intuitive way for humans to communicate
tasks to a robot. Prior work in imitation learning typically requires each task be specified with …

Cited by 242 Related articles All 8 versions

Can foundation models perform zero-shot task specification for robot manipulation?

Y Cui, S Niekum, A Gupta, V Kumar… - … for dynamics and …, 2022 - proceedings.mlr.press

Task specification is at the core of programming autonomous robots. A low-effort modality for
task specification is critical for engagement of non-expert end users and ultimate adoption of …

Cited by 87 Related articles All 5 versions

Vision-language models as success detectors

Y Du, K Konyushkova, M Denil, A Raju… - arXiv preprint arXiv …, 2023 - arxiv.org

Detecting successful behaviour is crucial for training intelligent agents. As such,
generalisable reward models are a prerequisite for agents that can learn to generalise their …

Cited by 67 Related articles All 3 versions

How to leverage unlabeled data in offline reinforcement learning

T Yu, A Kumar, Y Chebotar… - International …, 2022 - proceedings.mlr.press

Offline reinforcement learning (RL) can learn control policies from static datasets but, like
standard RL methods, it requires reward annotations for every transition. In many cases …

Cited by 67 Related articles All 5 versions