Model-free episodic control
State of the art deep reinforcement learning algorithms take many millions of interactions to
attain human-level performance. Humans, on the other hand, can very quickly exploit highly …
attain human-level performance. Humans, on the other hand, can very quickly exploit highly …
Generalizable episodic memory for deep reinforcement learning
Episodic memory-based methods can rapidly latch onto past successful strategies by a non-
parametric memory and improve sample efficiency of traditional reinforcement learning …
parametric memory and improve sample efficiency of traditional reinforcement learning …
Neural episodic control
Deep reinforcement learning methods attain super-human performance in a wide range of
environments. Such methods are grossly inefficient, often taking orders of magnitudes more …
environments. Such methods are grossly inefficient, often taking orders of magnitudes more …
A unifying view of optimism in episodic reinforcement learning
G Neu, C Pike-Burke - Advances in Neural Information …, 2020 - proceedings.neurips.cc
The principle of``optimism in the face of uncertainty''underpins many theoretically successful
reinforcement learning algorithms. In this paper we provide a general framework for …
reinforcement learning algorithms. In this paper we provide a general framework for …
Episodic memory deep q-networks
Reinforcement learning (RL) algorithms have made huge progress in recent years by
leveraging the power of deep neural networks (DNN). Despite the success, deep RL …
leveraging the power of deep neural networks (DNN). Despite the success, deep RL …
Exploration via elliptical episodic bonuses
In recent years, a number of reinforcement learning (RL) methods have been pro-posed to
explore complex environments which differ across episodes. In this work, we show that the …
explore complex environments which differ across episodes. In this work, we show that the …
Recall traces: Backtracking models for efficient reinforcement learning
In many environments only a tiny subset of all states yield high reward. In these cases, few of
the interactions with the environment provide a relevant learning signal. Hence, we may …
the interactions with the environment provide a relevant learning signal. Hence, we may …
Deep reinforcement learning amidst lifelong non-stationarity
As humans, our goals and our environment are persistently changing throughout our lifetime
based on our experiences, actions, and internal and external drives. In contrast, typical …
based on our experiences, actions, and internal and external drives. In contrast, typical …
Learning to reinforcement learn
In recent years deep reinforcement learning (RL) systems have attained superhuman
performance in a number of challenging task domains. However, a major limitation of such …
performance in a number of challenging task domains. However, a major limitation of such …
Tighter problem-dependent regret bounds in reinforcement learning without domain knowledge using value function bounds
A Zanette, E Brunskill - International Conference on Machine …, 2019 - proceedings.mlr.press
Strong worst-case performance bounds for episodic reinforcement learning exist but
fortunately in practice RL algorithms perform much better than such bounds would predict …
fortunately in practice RL algorithms perform much better than such bounds would predict …