Distributionally adaptive meta reinforcement learning

A survey of meta-reinforcement learning

J Beck, R Vuorio, EZ Liu, Z Xiong, L Zintgraf… - arXiv preprint arXiv …, 2023 - arxiv.org

While deep reinforcement learning (RL) has fueled multiple high-profile successes in
machine learning, it is held back from more widespread adoption by its often poor data …

Cited by 156 · Related articles · All 2 versions

Provable guarantees for generative behavior cloning: Bridging low-level stability and high-level behavior

A Block, A Jadbabaie, D Pfrommer… - Advances in …, 2024 - proceedings.neurips.cc

We propose a theoretical framework for studying behavior cloning of complex expert
demonstrations using generative modeling. Our framework invokes low-level controllers …

Cited by 16 · Related articles · All 3 versions

Accelerating exploration with unlabeled prior data

Q Li, J Zhang, D Ghosh, A Zhang… - Advances in Neural …, 2023 - proceedings.neurips.cc

Learning to solve tasks from a sparse reward signal is a major challenge for standard
reinforcement learning (RL) algorithms. However, in the real world, agents rarely need to …

Cited by 9 · Related articles · All 5 versions

Statistical learning under heterogeneous distribution shift

M Simchowitz, A Ajay, P Agrawal… - International …, 2023 - proceedings.mlr.press

This paper studies the prediction of a target $\mathbf{z}$ from a pair of random variables
$(\mathbf{x},\mathbf{y})$, where the ground-truth predictor is additive $\mathbb{E}[\mathbf …

Cited by 6 · Related articles · All 6 versions

Parameterizing non-parametric meta-reinforcement learning tasks via subtask decomposition

S Lee, M Cho, Y Sung - Advances in Neural Information …, 2023 - proceedings.neurips.cc

Meta-reinforcement learning (meta-RL) techniques have demonstrated remarkable success
in generalizing deep reinforcement learning across a range of tasks. Nevertheless, these …

Cited by 4 · Related articles · All 4 versions

Imitating complex trajectories: Bridging low-level stability and high-level behavior

A Block, D Pfrommer, M Simchowitz - arXiv preprint arXiv:2307.14619, 2023 - arxiv.org

We propose a theoretical framework for studying the imitation of stochastic, non-Markovian,
potentially multi-modal (i.e., "complex") expert demonstrations in nonlinear dynamical …

Cited by 2 · Related articles · All 2 versions

Diagnosing and remediating harmful data shifts for the responsible deployment of clinical AI models

V Subasri, A Krishnan, A Dhalla, D Pandya, D Malkin… - medRxiv, 2023 - medrxiv.org

Harmful data shifts occur when the distribution of data used to train a clinical AI system
differs significantly from the distribution of data encountered during deployment, leading to …

Cited by 5 · Related articles · All 2 versions

Train hard, fight easy: Robust meta reinforcement learning

I Greenberg, S Mannor, G Chechik… - Advances in Neural …, 2024 - proceedings.neurips.cc

A major challenge of reinforcement learning (RL) in real-world applications is the variation
between environments, tasks or clients. Meta-RL (MRL) addresses this issue by learning a …

Cited by 5 · Related articles · All 7 versions

GRAM: Generalization in Deep RL with a Robust Adaptation Module

J Queeney, X Cai, M Benosman, JP How - arXiv preprint arXiv:2412.04323, 2024 - arxiv.org

The reliable deployment of deep reinforcement learning in real-world settings requires the
ability to generalize across a variety of conditions, including both in-distribution scenarios …

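Inverse reinforcement learning algorithm based on D2GA

段成龙, 袁杰, 常乾坤, 张宁宁 - Computer Engineering & Science (计算机工程与科学), 2024 - joces.nudt.edu.cn

To address the difficulty of obtaining expert samples and the low utilization of generated samples
in traditional generative adversarial inverse reinforcement learning, a dual-discriminator generative adversarial (D2GA) inverse reinforcement learning algorithm based on the hindsight experience replay (HER) strategy is proposed. In this algorithm …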