Multi-agent reinforcement learning: A selective overview of theories and algorithms

K Zhang, Z Yang, T Başar - Handbook of reinforcement learning and …, 2021 - Springer
Recent years have witnessed significant advances in reinforcement learning (RL), which
has registered tremendous success in solving various sequential decision-making problems …

Computation of equilibria in finite games

RD McKelvey, A McLennan - Handbook of computational economics, 1996 - Elsevier
Publisher Summary: This chapter provides an overview of the latest state of the art of
methods for numerical computation of Nash equilibria—and refinements of Nash equilibria …

Heads-up limit hold'em poker is solved

M Bowling, N Burch, M Johanson, O Tammelin - Science, 2015 - science.org
Poker is a family of games that exhibit imperfect information, where players do not have full
knowledge of past events. Whereas many perfect-information games have been solved (e.g. …

[BOOK][B] Planning algorithms

SM LaValle - 2006 - books.google.com
Planning algorithms are impacting technical disciplines and industries around the world,
including robotics, computer-aided design, manufacturing, computer graphics, aerospace …

A unified approach to reinforcement learning, quantal response equilibria, and two-player zero-sum games

S Sokota, R D'Orazio, JZ Kolter, N Loizou… - arXiv preprint arXiv …, 2022 - arxiv.org
This work studies an algorithm, which we call magnetic mirror descent, that is inspired by
mirror descent and the non-Euclidean proximal gradient algorithm. Our contribution is …

On last-iterate convergence beyond zero-sum games

I Anagnostides, I Panageas, G Farina… - International …, 2022 - proceedings.mlr.press
Most existing results about last-iterate convergence of learning dynamics are limited to
two-player zero-sum games, and only apply under rigid assumptions about what dynamics the …

Computing the optimal strategy to commit to

V Conitzer, T Sandholm - Proceedings of the 7th ACM conference on …, 2006 - dl.acm.org
In multiagent systems, strategic settings are often analyzed under the assumption that the
players choose their strategies simultaneously. However, this model is not always realistic …

Fictitious self-play in extensive-form games

J Heinrich, M Lanctot, D Silver - International conference on …, 2015 - proceedings.mlr.press
Fictitious play is a popular game-theoretic model of learning in games. However, it has
received little attention in practical applications to large problems. This paper introduces two …

[BOOK][B] Algorithms for sequential decision-making

ML Littman - 1996 - search.proquest.com
Sequential decision making is a fundamental task faced by any intelligent agent in an
extended interaction with its environment; it is the act of answering the question "What …

Playing large games using simple strategies

RJ Lipton, E Markakis, A Mehta - … of the 4th ACM Conference on …, 2003 - dl.acm.org
We prove the existence of ε-Nash equilibrium strategies with support logarithmic in the
number of pure strategies. We also show that the payoffs to all players in any (exact) Nash …