A survey of meta-reinforcement learning

J Beck, R Vuorio, EZ Liu, Z Xiong, L Zintgraf… - arXiv preprint arXiv …, 2023 - arxiv.org
While deep reinforcement learning (RL) has fueled multiple high-profile successes in
machine learning, it is held back from more widespread adoption by its often poor data …

Provable guarantees for generative behavior cloning: Bridging low-level stability and high-level behavior

A Block, A Jadbabaie, D Pfrommer… - Advances in …, 2024 - proceedings.neurips.cc
We propose a theoretical framework for studying behavior cloning of complex expert
demonstrations using generative modeling. Our framework invokes low-level controllers …

Accelerating exploration with unlabeled prior data

Q Li, J Zhang, D Ghosh, A Zhang… - Advances in Neural …, 2023 - proceedings.neurips.cc
Learning to solve tasks from a sparse reward signal is a major challenge for standard
reinforcement learning (RL) algorithms. However, in the real world, agents rarely need to …

Statistical learning under heterogeneous distribution shift

M Simchowitz, A Ajay, P Agrawal… - International …, 2023 - proceedings.mlr.press
This paper studies the prediction of a target $\mathbf{z}$ from a pair of random variables
$(\mathbf{x}, \mathbf{y})$, where the ground-truth predictor is additive $\mathbb{E}[\mathbf …
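(For orientation: an additive ground-truth predictor of this kind would take the form $\mathbb{E}[\mathbf{z} \mid \mathbf{x}, \mathbf{y}] = f_\star(\mathbf{x}) + g_\star(\mathbf{y})$; this completion of the truncated formula is an illustrative assumption based on the standard additive-model setup, not a quote from the paper.)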

Parameterizing non-parametric meta-reinforcement learning tasks via subtask decomposition

S Lee, M Cho, Y Sung - Advances in Neural Information …, 2023 - proceedings.neurips.cc
Meta-reinforcement learning (meta-RL) techniques have demonstrated remarkable success
in generalizing deep reinforcement learning across a range of tasks. Nevertheless, these …

Imitating complex trajectories: Bridging low-level stability and high-level behavior

A Block, D Pfrommer, M Simchowitz - arXiv preprint arXiv:2307.14619, 2023 - arxiv.org
We propose a theoretical framework for studying the imitation of stochastic, non-Markovian,
potentially multi-modal (i.e., "complex") expert demonstrations in nonlinear dynamical …

Diagnosing and remediating harmful data shifts for the responsible deployment of clinical AI models

V Subasri, A Krishnan, A Dhalla, D Pandya, D Malkin… - medRxiv, 2023 - medrxiv.org
Harmful data shifts occur when the distribution of data used to train a clinical AI system
differs significantly from the distribution of data encountered during deployment, leading to …

Train hard, fight easy: Robust meta reinforcement learning

I Greenberg, S Mannor, G Chechik… - Advances in Neural …, 2024 - proceedings.neurips.cc
A major challenge of reinforcement learning (RL) in real-world applications is the variation
between environments, tasks or clients. Meta-RL (MRL) addresses this issue by learning a …

GRAM: Generalization in Deep RL with a Robust Adaptation Module

J Queeney, X Cai, M Benosman, JP How - arXiv preprint arXiv:2412.04323, 2024 - arxiv.org
The reliable deployment of deep reinforcement learning in real-world settings requires the
ability to generalize across a variety of conditions, including both in-distribution scenarios …

An inverse reinforcement learning algorithm based on D2GA

段成龙, 袁杰, 常乾坤, 张宁宁 - 计算机工程与科学, 2024 - joces.nudt.edu.cn
To address the difficulty of obtaining expert samples and the low utilization of generated samples in traditional generative adversarial inverse reinforcement learning,
a dual-discriminator generative adversarial (D2GA) inverse reinforcement learning algorithm based on the hindsight experience replay (HER) strategy is proposed. In this algorithm …
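
As a minimal illustration of the hindsight experience replay (HER) relabeling step referenced above (a sketch under an assumed goal-conditioned transition format; the field and function names are hypothetical and not taken from the paper):

    from dataclasses import dataclass, replace
    from typing import List

    @dataclass
    class Transition:
        state: tuple
        action: int
        reward: float
        goal: tuple           # goal the agent was originally pursuing
        achieved_goal: tuple  # goal actually reached after this step

    def her_relabel(episode: List[Transition]) -> List[Transition]:
        # "Final" HER strategy: replay the episode as if the last achieved goal
        # had been the intended one, so failed rollouts still yield reward signal.
        final_goal = episode[-1].achieved_goal
        return [
            replace(t, goal=final_goal,
                    reward=1.0 if t.achieved_goal == final_goal else 0.0)
            for t in episode
        ]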