Efficient reinforcement learning in Block MDPs: A model-free representation learning approach
We present BRIEE, an algorithm for efficient reinforcement learning in Markov Decision
Processes with block-structured dynamics (i.e., Block MDPs), where rich observations are …
Model-free representation learning and exploration in low-rank MDPs
The low-rank MDP has emerged as an important model for studying representation learning
and exploration in reinforcement learning. With a known representation, several model-free …
Revisiting the linear-programming framework for offline RL with general function approximation
Offline reinforcement learning (RL) aims to find an optimal policy for sequential decision-
making using a pre-collected dataset, without further interaction with the environment …
Offline reinforcement learning under value and density-ratio realizability: the power of gaps
We consider a challenging theoretical problem in offline reinforcement learning (RL):
obtaining sample-efficiency guarantees with a dataset lacking sufficient coverage, under …
Structure in reinforcement learning: A survey and open problems
Reinforcement Learning (RL), bolstered by the expressive capabilities of Deep Neural
Networks (DNNs) for function approximation, has demonstrated considerable success in …
On instance-dependent bounds for offline reinforcement learning with linear function approximation
Sample-efficient offline reinforcement learning (RL) with linear function approximation has
been studied extensively recently. Much of the prior work has yielded instance-independent …
Context-lumpable stochastic bandits
We consider a contextual bandit problem with $S$ contexts and $K$ actions. In each
round $t = 1, 2, \dots$ the learner observes a random context and chooses an action based …
Provable General Function Class Representation Learning in Multitask Bandits and MDPs
While multitask representation learning has become a popular approach in reinforcement
learning (RL) to boost the sample efficiency, the theoretical understanding of why and how it …
Scalable representation learning in linear contextual bandits with constant regret guarantees
We study the problem of representation learning in stochastic contextual linear bandits.
While the primary concern in this domain is usually to find \textit{realizable} representations …
Structure in Deep Reinforcement Learning: A Survey and Open Problems
Reinforcement Learning (RL), bolstered by the expressive capabilities of Deep Neural
Networks (DNNs) for function approximation, has demonstrated considerable success in …