Efficient reinforcement learning in block mdps: A model-free representation learning approach

X Zhang, Y Song, M Uehara, M Wang… - International …, 2022 - proceedings.mlr.press
We present BRIEE, an algorithm for efficient reinforcement learning in Markov Decision
Processes with block-structured dynamics (ie, Block MDPs), where rich observations are …

Model-free representation learning and exploration in low-rank mdps

A Modi, J Chen, A Krishnamurthy, N Jiang… - Journal of Machine …, 2024 - jmlr.org
The low-rank MDP has emerged as an important model for studying representation learning
and exploration in reinforcement learning. With a known representation, several model-free …

Revisiting the linear-programming framework for offline rl with general function approximation

AE Ozdaglar, S Pattathil, J Zhang… - … on Machine Learning, 2023 - proceedings.mlr.press
Offline reinforcement learning (RL) aims to find an optimal policy for sequential decision-
making using a pre-collected dataset, without further interaction with the environment …

Offline reinforcement learning under value and density-ratio realizability: the power of gaps

J Chen, N Jiang - Uncertainty in Artificial Intelligence, 2022 - proceedings.mlr.press
We consider a challenging theoretical problem in offline reinforcement learning (RL):
obtaining sample-efficiency guarantees with a dataset lacking sufficient coverage, under …

[PDF][PDF] Structure in reinforcement learning: A survey and open problems

A Mohan, A Zhang, M Lindauer - arXiv preprint arXiv:2306.16021, 2023 - academia.edu
Reinforcement Learning (RL), bolstered by the expressive capabilities of Deep Neural
Networks (DNNs) for function approximation, has demonstrated considerable success in …

On instance-dependent bounds for offline reinforcement learning with linear function approximation

T Nguyen-Tang, M Yin, S Gupta, S Venkatesh… - Proceedings of the …, 2023 - ojs.aaai.org
Sample-efficient offline reinforcement learning (RL) with linear function approximation has
been studied extensively recently. Much of the prior work has yielded instance-independent …

Context-lumpable stochastic bandits

CW Lee, Q Liu, Y Abbasi Yadkori… - Advances in …, 2024 - proceedings.neurips.cc
We consider a contextual bandit problem with $ S $ contexts and $ K $ actions. In each
round $ t= 1, 2,\dots $ the learnerobserves a random context and chooses an action based …

Provable General Function Class Representation Learning in Multitask Bandits and MDP

R Lu, A Zhao, SS Du, G Huang - Advances in Neural …, 2022 - proceedings.neurips.cc
While multitask representation learning has become a popular approach in reinforcement
learning (RL) to boost the sample efficiency, the theoretical understanding of why and how it …

Scalable representation learning in linear contextual bandits with constant regret guarantees

A Tirinzoni, M Papini, A Touati… - Advances in Neural …, 2022 - proceedings.neurips.cc
We study the problem of representation learning in stochastic contextual linear bandits.
While the primary concern in this domain is usually to find\textit {realizable} representations …

Structure in Deep Reinforcement Learning: A Survey and Open Problems

A Mohan, A Zhang, M Lindauer - Journal of Artificial Intelligence Research, 2024 - jair.org
Reinforcement Learning (RL), bolstered by the expressive capabilities of Deep Neural
Networks (DNNs) for function approximation, has demonstrated considerable success in …