A survey of inverse reinforcement learning: Challenges, methods and progress

S Arora, P Doshi - Artificial Intelligence, 2021 - Elsevier
Inverse reinforcement learning (IRL) is the problem of inferring the reward function of an
agent, given its policy or observed behavior. Analogous to RL, IRL is perceived both as a …

Cooperative multi-agent learning: The state of the art

L Panait, S Luke - Autonomous agents and multi-agent systems, 2005 - Springer
Cooperative multi-agent systems (MAS) are ones in which several agents attempt, through
their interaction, to jointly solve tasks or to maximize utility. Due to the interactions among the …

Cooperative multi-agent control using deep reinforcement learning

JK Gupta, M Egorov, M Kochenderfer - … Best Papers, São Paulo, Brazil, May …, 2017 - Springer
This work considers the problem of learning cooperative policies in complex, partially
observable domains without explicit communication. We extend three classes of single …

[BOOK][B] A concise introduction to decentralized POMDPs

FA Oliehoek, C Amato - 2016 - Springer
This book presents an overview of formal decision making methods for decentralized
cooperative systems. It is aimed at graduate students and researchers in the fields of …

Deep decentralized multi-task multi-agent reinforcement learning under partial observability

S Omidshafiei, J Pazis, C Amato… - … on Machine Learning, 2017 - proceedings.mlr.press
Many real-world tasks involve multiple agents with partial observability and limited
communication. Learning is challenging in these settings due to local viewpoints of agents …

Contrasting centralized and decentralized critics in multi-agent reinforcement learning

X Lyu, Y Xiao, B Daley, C Amato - arXiv preprint arXiv:2102.04402, 2021 - arxiv.org
Centralized Training for Decentralized Execution, where agents are trained offline using
centralized information but execute in a decentralized manner online, has gained popularity …

Reinforcement learning

MA Wiering, M Van Otterlo - Adaptation, learning, and optimization, 2012 - Springer
Marco Wiering, Martijn van Otterlo (Eds.). Reinforcement Learning: State-of-the-Art.
Adaptation, Learning, and Optimization, Volume 12 …

Independent reinforcement learners in cooperative Markov games: a survey regarding coordination problems

L Matignon, GJ Laurent, N Le Fort-Piat - The Knowledge …, 2012 - cambridge.org
In the framework of fully cooperative multi-agent systems, independent (non-communicative)
agents that learn by reinforcement must overcome several difficulties to manage to …

The complexity of decentralized control of Markov decision processes

DS Bernstein, R Givan, N Immerman… - Mathematics of …, 2002 - pubsonline.informs.org
We consider decentralized control of Markov decision processes and give complexity
bounds on the worst-case running time for algorithms that find optimal solutions …

Infinite-horizon policy-gradient estimation

J Baxter, PL Bartlett - Journal of Artificial Intelligence Research, 2001 - jair.org
Gradient-based approaches to direct policy search in reinforcement learning have received
much recent attention as a means to solve problems of partial observability and to avoid …