f-Policy Gradients: A General Framework for Goal-Conditioned RL using f-Divergences

Articles citing this work (3 results):

[PDF] arxiv.org

Unsupervised Zero-Shot Reinforcement Learning via Functional Reward Encodings

K Frans, S Park, P Abbeel, S Levine - arXiv preprint arXiv:2402.17135, 2024 - arxiv.org

Can we pre-train a generalist agent from a large amount of unlabeled offline trajectories
such that it can be immediately adapted to any new downstream tasks in a zero-shot …

Cited by: 3 (3 versions)

[PDF] arxiv.org

Score models for offline goal-conditioned reinforcement learning

H Sikchi, R Chitnis, A Touati, A Geramifard… - arXiv preprint arXiv …, 2023 - arxiv.org

Offline Goal-Conditioned Reinforcement Learning (GCRL) is tasked with learning to achieve
multiple goals in an environment purely from offline datasets using sparse reward functions …

Cited by: 5 (5 versions)

[PDF] openreview.net

[PDF] PROTO SUCCESSOR MEASURE: REPRESENTING THE

RM LEARNING - openreview.net

Having explored an environment, intelligent agents should be able to transfer their
knowledge to most downstream tasks within that environment. Referred to as “zero-shot …