The factored policy gradient planner (ipc-06 version)

S Sanner, C Boutilier - Artificial Intelligence, 2009 - Elsevier

Many traditional solution approaches to relationally specified decision-theoretic planning
problems (e.g., those stated in the probabilistic planning domain description language, or …

Cited by 134 · Related articles · All 20 versions

[PDF] arxiv.org

A theory of goal-oriented MDPs with dead ends

A Kolobov, D Weld - arXiv preprint arXiv:1210.4875, 2012 - arxiv.org

Stochastic Shortest Path (SSP) MDPs is a problem class widely studied in AI, especially in
probabilistic planning. They describe a wide range of scenarios but make the restrictive …

Cited by 83 · Related articles · All 12 versions

[PDF] aaai.org

LRTDP versus UCT for online probabilistic planning

A Kolobov, D Weld - Proceedings of the AAAI Conference on Artificial …, 2012 - ojs.aaai.org

UCT, the premier method for solving games such as Go, is also becoming the dominant
algorithm for probabilistic planning. Out of the five solvers at the International Probabilistic …

Cited by 45 · Related articles · All 17 versions

[PDF] hal.science

Shaping multi-agent systems with gradient reinforcement learning

O Buffet, A Dutech, F Charpillet - Autonomous Agents and Multi-Agent …, 2007 - Springer

An original reinforcement learning (RL) methodology is proposed for the design of multi-
agent systems. In the realistic setting of situated agents with local perception, the task of …

Cited by 55 · Related articles · All 14 versions

[PDF] microsoft.com

ReTrASE: Integrating Paradigms for Approximate Probabilistic Planning

A Kolobov, DS Weld - IJCAI 2009, 2009 - microsoft.com

Past approaches for solving MDPs have several weaknesses: 1) Decision-theoretic
computation over the state space can yield optimal results but scales poorly. 2) Value …

Cited by 43 · Related articles · All 11 versions

[PDF] aaai.org

[PDF] FF+ FPG: Guiding a Policy-Gradient Planner.

O Buffet, D Aberdeen - ICAPS, 2007 - cdn.aaai.org

The Factored Policy-Gradient planner (FPG) (Buffet & Aberdeen 2006) was a
successful competitor in the probabilistic track of the 2006 International Planning …

Cited by 27 · Related articles · All 11 versions

[PDF] psu.edu

Combining policy search with planning in multi-agent cooperation

J Ma, S Cameron - RoboCup 2008: Robot Soccer World Cup XII 12, 2009 - Springer

It is cooperation that essentially differentiates multi-agent systems (MASs) from single-agent
intelligence. In realistic MAS applications such as RoboCup, repeated work has shown that …

Cited by 17 · Related articles · All 10 versions

A survey of parallel probabilistic planning.

饶东宁，李建华，蒋志华… - Application Research of …, 2016 - search.ebscohost.com

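Automated planning generates, for a specific problem in a specific domain, a plan composed of applicable actions. In classical planning, action effects
are deterministic and only one action can be executed per time step. In real-world problems, however, the effects of actions are often uncertain …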

Cited by 2 · Related articles · All 3 versions

[PDF] plos.org

Decision making under uncertainty: a quasimetric approach

S N'Guyen, C Moulin-Frier, J Droulez - PloS one, 2013 - journals.plos.org

We propose a new approach for solving a class of discrete decision making problems under
uncertainty with positive cost. This issue concerns multiple and diverse fields such as …

Cited by 9 · Related articles · All 22 versions

[PDF] tue.nl

A proposal for semantic recommender for outdoor audio tour guides

A Koren, N Stash, A Andreev - … Applications (PeMA 2011) at the 5th …, 2011 - research.tue.nl

Location-based services are widely spread both as entertainment and business
applications. The focus of this work is on one particular area–tourist-oriented information …

Cited by 7 · Related articles · All 6 versions