R3M: A universal visual representation for robot manipulation
We study how visual representations pre-trained on diverse human video data can enable
data-efficient learning of downstream robotic manipulation tasks. Concretely, we pre-train a …
Open X-Embodiment: Robotic learning datasets and RT-X models
Large, high-capacity models trained on diverse datasets have shown remarkable successes
on efficiently tackling downstream applications. In domains from NLP to Computer Vision …
Contrastive learning as goal-conditioned reinforcement learning
In reinforcement learning (RL), it is easier to solve a task if given a good representation.
While deep RL should automatically acquire such good representations, prior work often …
Reinforcement learning with action-free pre-training from videos
Recent unsupervised pre-training methods have been shown to be effective on language and
vision domains by learning useful representations for multiple downstream tasks. In this …
MimicPlay: Long-horizon imitation learning by watching human play
Imitation learning from human demonstrations is a promising paradigm for teaching robots
manipulation skills in the real world. However, learning complex long-horizon tasks often …
Learning generalizable robotic reward functions from "in-the-wild" human videos
We are motivated by the goal of generalist robots that can complete a wide range of tasks
across many environments. Critical to this is the robot's ability to acquire some metric of task …
Reinforcement learning with videos: Combining offline observations with interaction
Reinforcement learning is a powerful framework for robots to acquire skills from experience,
but often requires a substantial amount of online data collection. As a result, it is difficult to …
Pre-training contextualized world models with in-the-wild videos for reinforcement learning
Unsupervised pre-training methods utilizing large and diverse datasets have achieved
tremendous success across a range of domains. Recent work has investigated such …
Semi-supervised offline reinforcement learning with action-free trajectories
Natural agents can effectively learn from multiple data sources that differ in size, quality, and
types of measurements. We study this heterogeneity in the context of offline reinforcement …
Physion: Evaluating physical prediction from vision in humans and machines
While current vision algorithms excel at many challenging tasks, it is unclear how well they
understand the physical dynamics of real-world environments. Here we introduce Physion, a …