How to train your robot with deep reinforcement learning: lessons we have learned

J Ibarz, J Tan, C Finn, M Kalakrishnan… - … Journal of Robotics …, 2021 - journals.sagepub.com
Deep reinforcement learning (RL) has emerged as a promising approach for autonomously
acquiring complex behaviors from low-level sensor observations. Although a large portion of …

Survey on large language model-enhanced reinforcement learning: Concept, taxonomy, and methods

Y Cao, H Zhao, Y Cheng, T Shu, Y Chen… - … on Neural Networks …, 2024 - ieeexplore.ieee.org
With extensive pretrained knowledge and high-level general capabilities, large language
models (LLMs) emerge as a promising avenue to augment reinforcement learning (RL) in …

On the opportunities and risks of foundation models

R Bommasani, DA Hudson, E Adeli, R Altman… - arXiv preprint arXiv …, 2021 - arxiv.org
AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are
trained on broad data at scale and are adaptable to a wide range of downstream tasks. We …

VIP: Towards universal visual reward and representation via value-implicit pre-training

YJ Ma, S Sodhani, D Jayaraman, O Bastani… - arXiv preprint arXiv …, 2022 - arxiv.org
Reward and representation learning are two long-standing challenges for learning an
expanding set of robot manipulation skills from sensory observations. Given the inherent …

What matters in learning from offline human demonstrations for robot manipulation

A Mandlekar, D Xu, J Wong, S Nasiriany… - arXiv preprint arXiv …, 2021 - arxiv.org
Imitating human demonstrations is a promising approach to endow robots with various
manipulation capabilities. While recent advances have been made in imitation learning and …

Learning language-conditioned robot behavior from offline data and crowd-sourced annotation

S Nair, E Mitchell, K Chen… - Conference on Robot …, 2022 - proceedings.mlr.press
We study the problem of learning a range of vision-based manipulation tasks from a large
offline dataset of robot interaction. In order to accomplish this, humans need easy and …

Language conditioned imitation learning over unstructured data

C Lynch, P Sermanet - arXiv preprint arXiv:2005.07648, 2020 - arxiv.org
Natural language is perhaps the most flexible and intuitive way for humans to communicate
tasks to a robot. Prior work in imitation learning typically requires each task be specified with …

Can foundation models perform zero-shot task specification for robot manipulation?

Y Cui, S Niekum, A Gupta, V Kumar… - … for dynamics and …, 2022 - proceedings.mlr.press
Task specification is at the core of programming autonomous robots. A low-effort modality for
task specification is critical for engagement of non-expert end users and ultimate adoption of …

Vision-language models as success detectors

Y Du, K Konyushkova, M Denil, A Raju… - arXiv preprint arXiv …, 2023 - arxiv.org
Detecting successful behaviour is crucial for training intelligent agents. As such,
generalisable reward models are a prerequisite for agents that can learn to generalise their …

How to leverage unlabeled data in offline reinforcement learning

T Yu, A Kumar, Y Chebotar… - International …, 2022 - proceedings.mlr.press
Offline reinforcement learning (RL) can learn control policies from static datasets but, like
standard RL methods, it requires reward annotations for every transition. In many cases …