Learning in observable POMDPs, without computationally intractable oracles

N Golowich, A Moitra, D Rohatgi - Advances in neural …, 2022 - proceedings.neurips.cc
Much of reinforcement learning theory is built on top of oracles that are computationally hard
to implement. Specifically for learning near-optimal policies in Partially Observable Markov …

Partially observable Markov decision processes

MTJ Spaan - Reinforcement learning: State-of-the-art, 2012 - Springer
For reinforcement learning in environments in which an agent has access to a reliable state
signal, methods based on the Markov decision process (MDP) have had many successes. In …
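For orientation (this sketch is not from the chapter itself): what POMDP methods maintain in place of a reliable state signal is a Bayesian belief over hidden states, updated after every action and observation. A minimal sketch, assuming discrete states and a hypothetical array layout for the transition model T and observation model Z:

```python
import numpy as np

def belief_update(b, a, o, T, Z):
    """Bayes filter step for a discrete POMDP.

    b: current belief over states, shape (S,)
    a: action index, o: observation index
    T: T[a, s, s'] = P(s' | s, a)   (layout assumed for this sketch)
    Z: Z[a, s', o] = P(o | s', a)
    """
    b_pred = b @ T[a]            # predict: push belief through transitions
    b_new = Z[a, :, o] * b_pred  # correct: weight by observation likelihood
    return b_new / b_new.sum()   # normalize back to a distribution
```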

Closing the learning-planning loop with predictive state representations

B Boots, SM Siddiqi, GJ Gordon - The International Journal …, 2011 - journals.sagepub.com
A central problem in artificial intelligence is to choose actions to maximize reward in a
partially observable, uncertain environment. To do so, we must learn an accurate …
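An illustrative sketch, not code from the paper: the linear predictive-state update used in transformed PSRs, b' = B_ao b / (b_inf^T B_ao b). The names B_ao and b_inf follow common transformed-PSR notation; the spectral learning step that produces them is omitted here:

```python
import numpy as np

def psr_update(b, B_ao, b_inf):
    """One predictive-state update after taking action a and observing o.

    b:     current predictive state (predictions of core tests)
    B_ao:  learned linear operator for this (action, observation) pair
    b_inf: normalization vector; both are assumed outputs of a
           spectral/subspace learning step.
    """
    v = B_ao @ b
    return v / (b_inf @ v)  # renormalize so predictions stay consistent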

Particle filter networks with application to visual localization

P Karkus, D Hsu, WS Lee - Conference on robot learning, 2018 - proceedings.mlr.press
Particle filtering is a powerful approach to sequential state estimation and finds application
in many domains, including robot localization, object tracking, etc. To apply particle filtering …
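For reference, a minimal bootstrap particle filter step (an illustrative sketch, not the learned network architecture of the paper; `transition_sample` and `obs_likelihood` are placeholder callables for a task-specific motion and observation model):

```python
import numpy as np

def particle_filter_step(particles, weights, action, obs,
                         transition_sample, obs_likelihood, rng):
    """One predict-weight-resample step of a bootstrap particle filter.

    particles: array of sampled states; weights: matching (N,) array.
    transition_sample(particles, action, rng) -> propagated particles
    obs_likelihood(particles, obs) -> per-particle likelihoods
    """
    particles = transition_sample(particles, action, rng)  # predict
    weights = weights * obs_likelihood(particles, obs)     # weight
    weights /= weights.sum()
    # Resample to avoid weight degeneracy.
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    return particles[idx], np.full(len(particles), 1.0 / len(particles))
```

Resampling at every step, as here, is the simplest variant; practical filters often resample only when the effective sample size drops.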

QMDP-net: Deep learning for planning under partial observability

P Karkus, D Hsu, WS Lee - Advances in neural information …, 2017 - proceedings.neurips.cc
This paper introduces the QMDP-net, a neural network architecture for planning under
partial observability. The QMDP-net combines the strengths of model-free learning and …
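For context, the plain (non-learned) QMDP heuristic that the architecture builds on scores each action by belief-weighted fully observable Q-values, Q(b, a) = sum_s b(s) Q_MDP(s, a). A sketch assuming the underlying MDP's Q-table is already computed:

```python
import numpy as np

def qmdp_action(belief, Q_mdp):
    """QMDP approximation: pick the action maximizing the
    belief-weighted Q-values of the fully observable MDP.

    belief: (S,) belief over states
    Q_mdp:  (S, A) Q-values from solving the underlying MDP
    """
    return int(np.argmax(belief @ Q_mdp))
```

The known trade-off: QMDP assumes uncertainty vanishes after one step, so it never selects purely information-gathering actions.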

Scalable planning and learning for multiagent POMDPs

C Amato, F Oliehoek - Proceedings of the AAAI Conference on Artificial …, 2015 - ojs.aaai.org
Online, sample-based planning algorithms for POMDPs have shown great promise in
scaling to problems with large state spaces, but they become intractable for large action and …
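Illustrative only: a toy Monte Carlo action-value estimate over belief particles, in the spirit of sample-based online POMDP planning. Real planners of the kind scaled here build search trees rather than flat rollouts, and `simulate` is a placeholder generative model:

```python
import numpy as np

def mc_plan(belief_particles, actions, simulate, rollout_depth, n_sims, rng):
    """Estimate each action's value by sampling a state from the belief
    and rolling out a uniform-random policy (no discounting, for brevity).

    simulate(state, action, rng) -> (next_state, reward)
    """
    values = np.zeros(len(actions))
    for i, a in enumerate(actions):
        for _ in range(n_sims):
            s = belief_particles[rng.integers(len(belief_particles))]
            s, ret = simulate(s, a, rng)
            for _ in range(rollout_depth - 1):
                s, r = simulate(s, rng.choice(actions), rng)
                ret += r
            values[i] += ret / n_sims
    return actions[int(np.argmax(values))]
```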

Online learning for unknown partially observable MDPs

MJ Jahromi, R Jain, A Nayyar - International Conference on …, 2022 - proceedings.mlr.press
Solving Partially Observable Markov Decision Processes (POMDPs) is hard.
Learning optimal controllers for POMDPs when the model is unknown is harder. Online …

[BOOK][B] Probabilistic planning for robotic exploration

T Smith - 2007 - search.proquest.com
Robotic exploration tasks involve inherent uncertainty. They typically include navigating
through unknown terrain, searching for features that may or may not be present, and …

Robust partially observable Markov decision process

T Osogami - International Conference on Machine Learning, 2015 - proceedings.mlr.press
We seek to find the robust policy that maximizes the expected cumulative reward for the
worst case when a partially observable Markov decision process (POMDP) has uncertain …
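The objective in such robust formulations is a max-min over a set of plausible models; a hedged rendering (symbols chosen here, not necessarily the paper's notation):

```latex
\max_{\pi}\;\min_{\theta \in \Theta}\;
\mathbb{E}_{\theta}^{\pi}\!\left[\sum_{t=0}^{T} r(s_t, a_t)\right]
```

where \pi ranges over history-dependent policies and \Theta is the uncertainty set of transition/observation parameters.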

Reinforcement learning with limited reinforcement: Using Bayes risk for active learning in POMDPs

F Doshi, J Pineau, N Roy - … of the 25th international conference on …, 2008 - dl.acm.org
Partially Observable Markov Decision Processes (POMDPs) have succeeded in planning
domains that require balancing actions that increase an agent's knowledge and actions that …