Learning to cooperate via policy search

S Arora, P Doshi - Artificial Intelligence, 2021 - Elsevier

Inverse reinforcement learning (IRL) is the problem of inferring the reward function of an
agent, given its policy or observed behavior. Analogous to RL, IRL is perceived both as a …

Cited by 797 · Related articles · All 6 versions

Cooperative multi-agent learning: The state of the art

L Panait, S Luke - Autonomous agents and multi-agent systems, 2005 - Springer

Cooperative multi-agent systems (MAS) are ones in which several agents attempt, through
their interaction, to jointly solve tasks or to maximize utility. Due to the interactions among the …

Cited by 2006 · Related articles · All 18 versions

Cooperative multi-agent control using deep reinforcement learning

JK Gupta, M Egorov, M Kochenderfer - … Best Papers, São Paulo, Brazil, May …, 2017 - Springer

This work considers the problem of learning cooperative policies in complex, partially
observable domains without explicit communication. We extend three classes of single …

Cited by 1254 · Related articles · All 3 versions

[BOOK][B] A concise introduction to decentralized POMDPs

FA Oliehoek, C Amato - 2016 - Springer

This book presents an overview of formal decision making methods for decentralized
cooperative systems. It is aimed at graduate students and researchers in the fields of …

Cited by 1363 · Related articles · All 13 versions

Deep decentralized multi-task multi-agent reinforcement learning under partial observability

S Omidshafiei, J Pazis, C Amato… - … on Machine Learning, 2017 - proceedings.mlr.press

Many real-world tasks involve multiple agents with partial observability and limited
communication. Learning is challenging in these settings due to local viewpoints of agents …

Cited by 667 · Related articles · All 7 versions

Contrasting centralized and decentralized critics in multi-agent reinforcement learning

X Lyu, Y Xiao, B Daley, C Amato - arXiv preprint arXiv:2102.04402, 2021 - arxiv.org

Centralized Training for Decentralized Execution, where agents are trained offline using
centralized information but execute in a decentralized manner online, has gained popularity …

Cited by 156 · Related articles · All 7 versions

Reinforcement learning

MA Wiering, M Van Otterlo - Adaptation, learning, and optimization, 2012 - Springer

Marco Wiering and Martijn van Otterlo (Eds.), Reinforcement Learning: State-of-the-Art.
Adaptation, Learning, and Optimization, Volume 12 …

Cited by 1436 · Related articles · All 8 versions

Independent reinforcement learners in cooperative Markov games: a survey regarding coordination problems

L Matignon, GJ Laurent, N Le Fort-Piat - The Knowledge …, 2012 - cambridge.org

In the framework of fully cooperative multi-agent systems, independent (non-communicative)
agents that learn by reinforcement must overcome several difficulties to manage to …

Cited by 576 · Related articles · All 13 versions

The complexity of decentralized control of Markov decision processes

DS Bernstein, R Givan, N Immerman… - Mathematics of …, 2002 - pubsonline.informs.org

We consider decentralized control of Markov decision processes and give complexity
bounds on the worst-case running time for algorithms that find optimal solutions …

Cited by 2157 · Related articles · All 26 versions

Infinite-horizon policy-gradient estimation

J Baxter, PL Bartlett - Journal of Artificial Intelligence Research, 2001 - jair.org

Gradient-based approaches to direct policy search in reinforcement learning have received
much recent attention as a means to solve problems of partial observability and to avoid …

Cited by 1260 · Related articles · All 34 versions