Bilinear classes: A structural framework for provable generalization in RL
This work introduces Bilinear Classes, a new structural framework, which permit
generalization in reinforcement learning in a wide variety of settings through the use of …
Randomized ensembled double Q-learning: Learning fast without a model
Using a high Update-To-Data (UTD) ratio, model-based methods have recently achieved
much higher sample efficiency than previous model-free methods for continuous-action DRL …
Deep reinforcement learning with plasticity injection
A growing body of evidence suggests that neural networks employed in deep reinforcement
learning (RL) gradually lose their plasticity, the ability to learn from new data; however, the …
Safe reinforcement learning by imagining the near future
Safe reinforcement learning is a promising path toward applying reinforcement learning
algorithms to real-world problems, where suboptimal behaviors may lead to actual negative …
VRL3: A data-driven framework for visual deep reinforcement learning
We propose VRL3, a powerful data-driven framework with a simple design for solving
challenging visual deep reinforcement learning (DRL) tasks. We analyze a number of major …
Learning barrier certificates: Towards safe reinforcement learning with zero training-time violations
Training-time safety violations have been a major concern when we deploy reinforcement
learning algorithms in the real world. This paper explores the possibility of safe RL …
When is agnostic reinforcement learning statistically tractable?
We study the problem of agnostic PAC reinforcement learning (RL): given a policy class $\Pi$,
how many rounds of interaction with an unknown MDP (with a potentially large state and …
Model-based visual planning with self-supervised functional distances
A generalist robot must be able to complete a variety of tasks in its environment. One
appealing way to specify each task is in terms of a goal observation. However, learning goal …
Scaling active inference
In reinforcement learning (RL), agents often operate in partially observed and uncertain
environments. Model-based RL suggests that this is best achieved by learning and …
Provable model-based nonlinear bandit and reinforcement learning: Shelve optimism, embrace virtual curvature
This paper studies model-based bandit and reinforcement learning (RL) with nonlinear
function approximations. We propose to study convergence to approximate local maxima …