Multi-agent reinforcement learning: A selective overview of theories and algorithms

K Zhang, Z Yang, T Başar - Handbook of reinforcement learning and …, 2021 - Springer
Recent years have witnessed significant advances in reinforcement learning (RL), which
has registered tremendous success in solving various sequential decision-making problems …

Deep multiagent reinforcement learning: Challenges and directions

A Wong, T Bäck, AV Kononova, A Plaat - Artificial Intelligence Review, 2023 - Springer
This paper surveys the field of deep multiagent reinforcement learning (RL). The
combination of deep neural networks with RL has gained increased traction in recent years …

A survey and critique of multiagent deep reinforcement learning

P Hernandez-Leal, B Kartal, ME Taylor - Autonomous Agents and Multi …, 2019 - Springer
Deep reinforcement learning (RL) has achieved outstanding results in recent years. This has
led to a dramatic increase in the number of applications and methods. Recent works have …

A unified game-theoretic approach to multiagent reinforcement learning

M Lanctot, V Zambaldi, A Gruslys… - Advances in neural …, 2017 - proceedings.neurips.cc
There has been a resurgence of interest in multiagent reinforcement learning (MARL), due
partly to the recent success of deep neural networks. The simplest form of MARL is …

[Book] A concise introduction to decentralized POMDPs

FA Oliehoek, C Amato - 2016 - Springer
This book presents an overview of formal decision making methods for decentralized
cooperative systems. It is aimed at graduate students and researchers in the fields of …

Is multiagent deep reinforcement learning the answer or the question? A brief survey

P Hernandez-Leal, B Kartal, ME Taylor - learning, 2018 - researchgate.net
Deep reinforcement learning (RL) has achieved outstanding results in recent years. This has
led to a dramatic increase in the number of applications and methods. Recent works have …

Risk-aware shielding of partially observable Monte Carlo planning policies

G Mazzi, A Castellini, A Farinelli - Artificial Intelligence, 2023 - Elsevier
Partially Observable Monte Carlo Planning (POMCP) is a powerful online algorithm
that can generate approximate policies for large Partially Observable Markov Decision …

Safe policy synthesis in multi-agent POMDPs via discrete-time barrier functions

M Ahmadi, A Singletary, JW Burdick… - 2019 IEEE 58th …, 2019 - ieeexplore.ieee.org
A multi-agent partially observable Markov decision process (MPOMDP) is a modeling
paradigm used for high-level planning of heterogeneous autonomous agents subject to …

Learning in POMDPs with Monte Carlo tree search

S Katt, FA Oliehoek, C Amato - International Conference on …, 2017 - proceedings.mlr.press
The POMDP is a powerful framework for reasoning under outcome and information
uncertainty, but constructing an accurate POMDP model is difficult. Bayes-Adaptive Partially …

Cooperative traffic signal control using multi-step return and off-policy asynchronous advantage actor-critic graph algorithm

S Yang, B Yang, HS Wong, Z Kang - Knowledge-Based Systems, 2019 - Elsevier
Intelligent traffic signal control helps to reduce traffic congestion and thus has been studied
for a few decades. Multi-intersection cooperative traffic signal control (CTSC), which is more …