Unsupervised Zero-Shot Reinforcement Learning via Functional Reward Encodings

K Frans, S Park, P Abbeel, S Levine - arXiv preprint arXiv:2402.17135, 2024 - arxiv.org
Can we pre-train a generalist agent from a large amount of unlabeled offline trajectories
such that it can be immediately adapted to any new downstream tasks in a zero-shot …

Score models for offline goal-conditioned reinforcement learning

H Sikchi, R Chitnis, A Touati, A Geramifard… - arXiv preprint arXiv …, 2023 - arxiv.org
Offline Goal-Conditioned Reinforcement Learning (GCRL) is tasked with learning to achieve
multiple goals in an environment purely from offline datasets using sparse reward functions …

PROTO SUCCESSOR MEASURE: REPRESENTING THE

RM LEARNING - openreview.net
Having explored an environment, intelligent agents should be able to transfer their
knowledge to most downstream tasks within that environment. Referred to as “zero-shot …