R3M: A universal visual representation for robot manipulation

S Nair, A Rajeswaran, V Kumar, C Finn… - arXiv preprint arXiv …, 2022 - arxiv.org
We study how visual representations pre-trained on diverse human video data can enable
data-efficient learning of downstream robotic manipulation tasks. Concretely, we pre-train a …

Open X-Embodiment: Robotic learning datasets and RT-X models

A Padalkar, A Pooley, A Jain, A Bewley… - arXiv preprint arXiv …, 2023 - arxiv.org
Large, high-capacity models trained on diverse datasets have shown remarkable successes
on efficiently tackling downstream applications. In domains from NLP to Computer Vision …

Contrastive learning as goal-conditioned reinforcement learning

B Eysenbach, T Zhang, S Levine… - Advances in Neural …, 2022 - proceedings.neurips.cc
In reinforcement learning (RL), it is easier to solve a task if given a good representation.
While deep RL should automatically acquire such good representations, prior work often …

Reinforcement learning with action-free pre-training from videos

Y Seo, K Lee, SL James… - … Conference on Machine …, 2022 - proceedings.mlr.press
Recent unsupervised pre-training methods have shown to be effective on language and
vision domains by learning useful representations for multiple downstream tasks. In this …

MimicPlay: Long-horizon imitation learning by watching human play

C Wang, L Fan, J Sun, R Zhang, L Fei-Fei, D Xu… - arXiv preprint arXiv …, 2023 - arxiv.org
Imitation learning from human demonstrations is a promising paradigm for teaching robots
manipulation skills in the real world. However, learning complex long-horizon tasks often …

Learning generalizable robotic reward functions from "in-the-wild" human videos

AS Chen, S Nair, C Finn - arXiv preprint arXiv:2103.16817, 2021 - arxiv.org
We are motivated by the goal of generalist robots that can complete a wide range of tasks
across many environments. Critical to this is the robot's ability to acquire some metric of task …

Reinforcement learning with videos: Combining offline observations with interaction

K Schmeckpeper, O Rybkin, K Daniilidis… - arXiv preprint arXiv …, 2020 - arxiv.org
Reinforcement learning is a powerful framework for robots to acquire skills from experience,
but often requires a substantial amount of online data collection. As a result, it is difficult to …

Pre-training contextualized world models with in-the-wild videos for reinforcement learning

J Wu, H Ma, C Deng, M Long - Advances in Neural …, 2024 - proceedings.neurips.cc
Unsupervised pre-training methods utilizing large and diverse datasets have achieved
tremendous success across a range of domains. Recent work has investigated such …

Semi-supervised offline reinforcement learning with action-free trajectories

Q Zheng, M Henaff, B Amos… - … conference on machine …, 2023 - proceedings.mlr.press
Natural agents can effectively learn from multiple data sources that differ in size, quality, and
types of measurements. We study this heterogeneity in the context of offline reinforcement …

Physion: Evaluating physical prediction from vision in humans and machines

DM Bear, E Wang, D Mrowca, FJ Binder… - arXiv preprint arXiv …, 2021 - arxiv.org
While current vision algorithms excel at many challenging tasks, it is unclear how well they
understand the physical dynamics of real-world environments. Here we introduce Physion, a …