Bilinear classes: A structural framework for provable generalization in RL

S Du, S Kakade, J Lee, S Lovett… - International …, 2021 - proceedings.mlr.press
This work introduces Bilinear Classes, a new structural framework, which permit
generalization in reinforcement learning in a wide variety of settings through the use of …

Randomized ensembled double Q-learning: Learning fast without a model

X Chen, C Wang, Z Zhou, K Ross - arXiv preprint arXiv:2101.05982, 2021 - arxiv.org
Using a high Update-To-Data (UTD) ratio, model-based methods have recently achieved
much higher sample efficiency than previous model-free methods for continuous-action DRL …

Deep reinforcement learning with plasticity injection

E Nikishin, J Oh, G Ostrovski, C Lyle… - Advances in …, 2024 - proceedings.neurips.cc
A growing body of evidence suggests that neural networks employed in deep reinforcement
learning (RL) gradually lose their plasticity, the ability to learn from new data; however, the …

Safe reinforcement learning by imagining the near future

G Thomas, Y Luo, T Ma - Advances in Neural Information …, 2021 - proceedings.neurips.cc
Safe reinforcement learning is a promising path toward applying reinforcement learning
algorithms to real-world problems, where suboptimal behaviors may lead to actual negative …

VRL3: A data-driven framework for visual deep reinforcement learning

C Wang, X Luo, K Ross, D Li - Advances in Neural …, 2022 - proceedings.neurips.cc
We propose VRL3, a powerful data-driven framework with a simple design for solving
challenging visual deep reinforcement learning (DRL) tasks. We analyze a number of major …

Learning barrier certificates: Towards safe reinforcement learning with zero training-time violations

Y Luo, T Ma - Advances in Neural Information Processing …, 2021 - proceedings.neurips.cc
Training-time safety violations have been a major concern when we deploy reinforcement
learning algorithms in the real world. This paper explores the possibility of safe RL …

When is agnostic reinforcement learning statistically tractable?

Z Jia, G Li, A Rakhlin, A Sekhari… - Advances in Neural …, 2024 - proceedings.neurips.cc
We study the problem of agnostic PAC reinforcement learning (RL): given a policy class $\Pi$,
how many rounds of interaction with an unknown MDP (with a potentially large state and …

Model-based visual planning with self-supervised functional distances

S Tian, S Nair, F Ebert, S Dasari, B Eysenbach… - arXiv preprint arXiv …, 2020 - arxiv.org
A generalist robot must be able to complete a variety of tasks in its environment. One
appealing way to specify each task is in terms of a goal observation. However, learning goal …

Scaling active inference

A Tschantz, M Baltieri, AK Seth… - 2020 international joint …, 2020 - ieeexplore.ieee.org
In reinforcement learning (RL), agents often operate in partially observed and uncertain
environments. Model-based RL suggests that this is best achieved by learning and …

Provable model-based nonlinear bandit and reinforcement learning: Shelve optimism, embrace virtual curvature

K Dong, J Yang, T Ma - Advances in neural information …, 2021 - proceedings.neurips.cc
This paper studies model-based bandit and reinforcement learning (RL) with nonlinear
function approximations. We propose to study convergence to approximate local maxima …