Bellman eluder dimension: New rich classes of RL problems, and sample-efficient algorithms
Finding the minimal structural assumptions that empower sample-efficient learning is one of
the most important research directions in Reinforcement Learning (RL). This paper …
Efficient model-free exploration in low-rank MDPs
A major challenge in reinforcement learning is to develop practical, sample-efficient
algorithms for exploration in high-dimensional domains where generalization and function …
Model-free representation learning and exploration in low-rank MDPs
The low-rank MDP has emerged as an important model for studying representation learning
and exploration in reinforcement learning. With a known representation, several model-free …
On reward-free reinforcement learning with linear function approximation
Reward-free reinforcement learning (RL) is a framework which is suitable for both the batch
RL setting and the setting where there are many reward functions of interest. During the …
The power of exploiter: Provable multi-agent RL in large state spaces
Modern reinforcement learning (RL) commonly engages practical problems with large state
spaces, where function approximation must be deployed to approximate either the value …
Instance-dependent complexity of contextual bandits and reinforcement learning: A disagreement-based perspective
In the classical multi-armed bandit problem, instance-dependent algorithms attain improved
performance on "easy" problems with a gap between the best and second-best arm. Are …
On function approximation in reinforcement learning: Optimism in the face of large state spaces
The classical theory of reinforcement learning (RL) has focused on tabular and linear
representations of value functions. Further progress hinges on combining RL with modern …
Towards general function approximation in zero-sum Markov games
This paper considers two-player zero-sum finite-horizon Markov games with simultaneous
moves. The study focuses on the challenging settings where the value function or the model …
A provably efficient model-free posterior sampling method for episodic reinforcement learning
Thompson Sampling is one of the most effective methods for contextual bandits and has
been generalized to posterior sampling for certain MDP settings. However, existing posterior …
Risk-sensitive reinforcement learning with function approximation: A debiasing approach
We study function approximation for episodic reinforcement learning with entropic risk
measure. We first propose an algorithm with linear function approximation. Compared to …