Learning in observable POMDPs, without computationally intractable oracles
Much of reinforcement learning theory is built on top of oracles that are computationally hard
to implement. Specifically for learning near-optimal policies in Partially Observable Markov …
Partially observable Markov decision processes
MTJ Spaan - Reinforcement learning: State-of-the-art, 2012 - Springer
For reinforcement learning in environments in which an agent has access to a reliable state
signal, methods based on the Markov decision process (MDP) have had many successes. In …
Closing the learning-planning loop with predictive state representations
A central problem in artificial intelligence is to choose actions to maximize reward in a
partially observable, uncertain environment. To do so, we must learn an accurate …
Particle filter networks with application to visual localization
Particle filtering is a powerful approach to sequential state estimation and finds application
in many domains, including robot localization, object tracking, etc. To apply particle filtering …
QMDP-net: Deep learning for planning under partial observability
This paper introduces the QMDP-net, a neural network architecture for planning under
partial observability. The QMDP-net combines the strengths of model-free learning and …
Scalable planning and learning for multiagent POMDPs
C Amato, F Oliehoek - Proceedings of the AAAI Conference on Artificial …, 2015 - ojs.aaai.org
Online, sample-based planning algorithms for POMDPs have shown great promise in
scaling to problems with large state spaces, but they become intractable for large action and …
Online learning for unknown partially observable MDPs
Solving Partially Observable Markov Decision Processes (POMDPs) is hard.
Learning optimal controllers for POMDPs when the model is unknown is harder. Online …
[BOOK][B] Probabilistic planning for robotic exploration
T Smith - 2007 - search.proquest.com
Robotic exploration tasks involve inherent uncertainty. They typically include navigating
through unknown terrain, searching for features that may or may not be present, and …
Robust partially observable Markov decision process
T Osogami - International Conference on Machine Learning, 2015 - proceedings.mlr.press
We seek to find the robust policy that maximizes the expected cumulative reward for the
worst case when a partially observable Markov decision process (POMDP) has uncertain …
Reinforcement learning with limited reinforcement: Using Bayes risk for active learning in POMDPs
Partially Observable Markov Decision Processes (POMDPs) have succeeded in planning
domains that require balancing actions that increase an agent's knowledge and actions that …