Multi-agent reinforcement learning: A selective overview of theories and algorithms

K Zhang, Z Yang, T Başar - Handbook of reinforcement learning and …, 2021 - Springer
Recent years have witnessed significant advances in reinforcement learning (RL), which
has registered tremendous success in solving various sequential decision-making problems …

An overview of multi-agent reinforcement learning from game theoretical perspective

Y Yang, J Wang - arXiv preprint arXiv:2011.00583, 2020 - arxiv.org
Following the remarkable success of the AlphaGO series, 2019 was a booming year that
witnessed significant advances in multi-agent reinforcement learning (MARL) techniques …

Cycles in adversarial regularized learning

P Mertikopoulos, C Papadimitriou, G Piliouras - Proceedings of the twenty …, 2018 - SIAM
Regularized learning is a fundamental technique in online optimization, machine learning,
and many other fields of computer science. A natural question that arises in this context is …

Multi-agent reinforcement learning: An overview

L Buşoniu, R Babuška, B De Schutter - Innovations in multi-agent systems …, 2010 - Springer
Multi-agent systems can be used to address problems in a variety of domains, including
robotics, distributed control, telecommunications, and economics. The complexity of many …

Learning in games with continuous action sets and unknown payoff functions

P Mertikopoulos, Z Zhou - Mathematical Programming, 2019 - Springer
This paper examines the convergence of no-regret learning in games with continuous action
sets. For concreteness, we focus on learning via “dual averaging”, a widely used class of no …

On improving model-free algorithms for decentralized multi-agent reinforcement learning

W Mao, L Yang, K Zhang… - … Conference on Machine …, 2022 - proceedings.mlr.press
Multi-agent reinforcement learning (MARL) algorithms often suffer from an exponential
sample complexity dependence on the number of agents, a phenomenon known as the …

Tight last-iterate convergence rates for no-regret learning in multi-player games

N Golowich, S Pattathil… - Advances in neural …, 2020 - proceedings.neurips.cc
We study the question of obtaining last-iterate convergence rates for no-regret learning
algorithms in multi-player games. We show that the optimistic gradient (OG) algorithm with a …

Adaptive learning in continuous games: Optimal regret bounds and convergence to Nash equilibrium

YG Hsieh, K Antonakopoulos… - … on Learning Theory, 2021 - proceedings.mlr.press
In game-theoretic learning, several agents are simultaneously following their individual
interests, so the environment is non-stationary from each player's perspective. In this context …

Bandit learning in concave N-person games

M Bravo, D Leslie… - Advances in Neural …, 2018 - proceedings.neurips.cc
This paper examines the long-run behavior of learning with bandit feedback in
non-cooperative concave games. The bandit framework accounts for extremely low-information …

Learning in games via reinforcement and regularization

P Mertikopoulos, WH Sandholm - Mathematics of Operations …, 2016 - pubsonline.informs.org
We investigate a class of reinforcement learning dynamics where players adjust their
strategies based on their actions' cumulative payoffs over time—specifically, by playing …