When is partially observable reinforcement learning not scary?
Partial observability is ubiquitous in applications of Reinforcement Learning (RL), in which
agents learn to make a sequence of decisions despite lacking complete information about …
Provably efficient reinforcement learning in partially observable dynamical systems
Abstract We study Reinforcement Learning for partially observable systems using function
approximation. We propose a new PO-bilinear framework that is general enough to include …
Optimistic MLE: a generic model-based algorithm for partially observable sequential decision making
This paper introduces a simple and efficient learning algorithm for general sequential decision
making. The algorithm combines Optimism for exploration with Maximum Likelihood …
Learning in observable POMDPs, without computationally intractable oracles
Much of reinforcement learning theory is built on top of oracles that are computationally hard
to implement. Specifically for learning near-optimal policies in Partially Observable Markov …
PAC reinforcement learning for predictive state representations
In this paper we study online Reinforcement Learning (RL) in partially observable dynamical
systems. We focus on the Predictive State Representations (PSRs) model, which is an …
Learning in POMDPs is sample-efficient with hindsight observability
POMDPs capture a broad class of decision making problems, but hardness results suggest
that learning is intractable even in simple settings due to the inherent partial observability …
GEC: a unified framework for interactive decision making in MDP, POMDP, and beyond
We study sample efficient reinforcement learning (RL) under the general framework of
interactive decision making, which includes Markov decision process (MDP), partially …
Future-dependent value-based off-policy evaluation in POMDPs
We study off-policy evaluation (OPE) for partially observable MDPs (POMDPs) with general
function approximation. Existing methods such as sequential importance sampling …
Lower bounds for learning in revealing POMDPs
This paper studies the fundamental limits of reinforcement learning (RL) in the challenging
partially observable setting. While it is well-established that learning in Partially Observable …
Posterior sampling for competitive RL: function approximation and partial observation
S Qiu, Z Dai, H Zhong, Z Wang… - Advances in Neural …, 2024 - proceedings.neurips.cc
This paper investigates posterior sampling algorithms for competitive reinforcement learning
(RL) in the context of general function approximations. Focusing on zero-sum Markov games …