Efficient computation of equilibria for extensive two-person games

K Zhang, Z Yang, T Başar - Handbook of reinforcement learning and …, 2021 - Springer

Recent years have witnessed significant advances in reinforcement learning (RL), which
has registered tremendous success in solving various sequential decision-making problems …

Cited by 1427 · Related articles · All 8 versions

Computation of equilibria in finite games

RD McKelvey, A McLennan - Handbook of computational economics, 1996 - Elsevier

Publisher Summary: This chapter provides an overview of the latest state of the art of
methods for numerical computation of Nash equilibria—and refinements of Nash equilibria …

Cited by 489 · Related articles · All 14 versions

Heads-up limit hold'em poker is solved

M Bowling, N Burch, M Johanson, O Tammelin - Science, 2015 - science.org

Poker is a family of games that exhibit imperfect information, where players do not have full
knowledge of past events. Whereas many perfect-information games have been solved (e.g. …

Cited by 591 · Related articles · All 15 versions

[BOOK][B] Planning algorithms

SM LaValle - 2006 - books.google.com

Planning algorithms are impacting technical disciplines and industries around the world,
including robotics, computer-aided design, manufacturing, computer graphics, aerospace …

Cited by 10351 · Related articles · All 11 versions

A unified approach to reinforcement learning, quantal response equilibria, and two-player zero-sum games

S Sokota, R D'Orazio, JZ Kolter, N Loizou… - arXiv preprint arXiv …, 2022 - arxiv.org

This work studies an algorithm, which we call magnetic mirror descent, that is inspired by
mirror descent and the non-Euclidean proximal gradient algorithm. Our contribution is …

Cited by 50 · Related articles · All 4 versions

On last-iterate convergence beyond zero-sum games

I Anagnostides, I Panageas, G Farina… - International …, 2022 - proceedings.mlr.press

Most existing results about last-iterate convergence of learning dynamics are limited to
two-player zero-sum games, and only apply under rigid assumptions about what dynamics the …

Cited by 41 · Related articles · All 8 versions

Computing the optimal strategy to commit to

V Conitzer, T Sandholm - Proceedings of the 7th ACM conference on …, 2006 - dl.acm.org

In multiagent systems, strategic settings are often analyzed under the assumption that the
players choose their strategies simultaneously. However, this model is not always realistic …

Cited by 625 · Related articles · All 12 versions

Fictitious self-play in extensive-form games

J Heinrich, M Lanctot, D Silver - International conference on …, 2015 - proceedings.mlr.press

Fictitious play is a popular game-theoretic model of learning in games. However, it has
received little attention in practical applications to large problems. This paper introduces two …

Cited by 385 · Related articles · All 17 versions

[BOOK][B] Algorithms for sequential decision-making

ML Littman - 1996 - search.proquest.com

Sequential decision making is a fundamental task faced by any intelligent agent in an
extended interaction with its environment; it is the act of answering the question "What …

Cited by 597 · Related articles · All 17 versions

Playing large games using simple strategies

RJ Lipton, E Markakis, A Mehta - … of the 4th ACM Conference on …, 2003 - dl.acm.org

We prove the existence of ε-Nash equilibrium strategies with support logarithmic in the
number of pure strategies. We also show that the payoffs to all players in any (exact) Nash …

Cited by 460 · Related articles · All 14 versions