Learning in observable POMDPs, without computationally intractable oracles
Much of reinforcement learning theory is built on top of oracles that are computationally hard
to implement. Specifically for learning near-optimal policies in Partially Observable Markov …
Partially observable Markov decision processes
MTJ Spaan - Reinforcement learning: State-of-the-art, 2012 - Springer
For reinforcement learning in environments in which an agent has access to a reliable state
signal, methods based on the Markov decision process (MDP) have had many successes. In …
Closing the learning-planning loop with predictive state representations
A central problem in artificial intelligence is to choose actions to maximize reward in a
partially observable, uncertain environment. To do so, we must learn an accurate …
Particle filter networks with application to visual localization
Particle filtering is a powerful approach to sequential state estimation and finds application
in many domains, including robot localization, object tracking, etc. To apply particle filtering …
QMDP-net: Deep learning for planning under partial observability
This paper introduces the QMDP-net, a neural network architecture for planning under
partial observability. The QMDP-net combines the strengths of model-free learning and …
Scalable planning and learning for multiagent POMDPs
C Amato, F Oliehoek - Proceedings of the AAAI Conference on Artificial …, 2015 - ojs.aaai.org
Online, sample-based planning algorithms for POMDPs have shown great promise in
scaling to problems with large state spaces, but they become intractable for large action and …
Online learning for unknown partially observable MDPs
Solving Partially Observable Markov Decision Processes (POMDPs) is hard.
Learning optimal controllers for POMDPs when the model is unknown is harder. Online …
[BOOK][B] Probabilistic planning for robotic exploration
T Smith - 2007 - search.proquest.com
Robotic exploration tasks involve inherent uncertainty. They typically include navigating
through unknown terrain, searching for features that may or may not be present, and …
Robust partially observable Markov decision process
T Osogami - International Conference on Machine Learning, 2015 - proceedings.mlr.press
We seek to find the robust policy that maximizes the expected cumulative reward for the
worst case when a partially observable Markov decision process (POMDP) has uncertain …
Reinforcement learning with limited reinforcement: Using Bayes risk for active learning in POMDPs
Partially Observable Markov Decision Processes (POMDPs) have succeeded in planning
domains that require balancing actions that increase an agent's knowledge and actions that …