Consistency and cautious fictitious play

K Zhang, Z Yang, T Başar - Handbook of reinforcement learning and …, 2021 - Springer

Recent years have witnessed significant advances in reinforcement learning (RL), which
has registered tremendous success in solving various sequential decision-making problems …

Cited by 1668 · Related articles · All 8 versions

An overview of multi-agent reinforcement learning from game theoretical perspective

Y Yang, J Wang - arXiv preprint arXiv:2011.00583, 2020 - arxiv.org

Following the remarkable success of the AlphaGO series, 2019 was a booming year that
witnessed significant advances in multi-agent reinforcement learning (MARL) techniques …

Cited by 342 · Related articles · All 2 versions

A modern introduction to online learning

F Orabona - arXiv preprint arXiv:1912.13213, 2019 - arxiv.org

In this monograph, I introduce the basic concepts of Online Learning through a modern view
of Online Convex Optimization. Here, online learning refers to the framework of regret …

Cited by 409 · Related articles · All 3 versions

[BOOK][B] Partially observed Markov decision processes

V Krishnamurthy - 2016 - books.google.com

Covering formulation, algorithms, and structural results, and linking theory to real-world
applications in controlled sensing (including social learning, adaptive radars and sequential …

Cited by 468 · Related articles · All 5 versions

Boosting: Foundations and algorithms

RE Schapire, Y Freund - Kybernetes, 2013 - emerald.com

The term “boosting” denotes a powerful means of facilitating machine learning that was
invented by the book's authors 20 years ago and intensively developed since. Despite this …

Cited by 1555 · Related articles · All 12 versions

Potential games

D Monderer, LS Shapley - Games and economic behavior, 1996 - Elsevier

Games and Economic Behavior 14, 124–143 (1996), Article No. 0044. Potential Games.
Dov Monderer, Faculty of Industrial Engineering and …

Cited by 5389 · Related articles · All 33 versions

[BOOK][B] Prediction, learning, and games

N Cesa-Bianchi, G Lugosi - 2006 - books.google.com

This important text and reference for researchers and students in machine learning, game
theory, statistics and information theory offers a comprehensive treatment of the problem of …

Cited by 5135 · Related articles · All 14 versions

The nonstochastic multiarmed bandit problem

P Auer, N Cesa-Bianchi, Y Freund, RE Schapire - SIAM journal on computing, 2002 - SIAM

In the multiarmed bandit problem, a gambler must decide which arm of K nonidentical slot
machines to play in a sequence of trials so as to maximize his reward. This classical …

Cited by 3211 · Related articles · All 29 versions

[PDF] Online convex programming and generalized infinitesimal gradient ascent

M Zinkevich - Proceedings of the 20th international conference on …, 2003 - cdn.aaai.org

Convex programming involves a convex set F⊆ Rn and a convex cost function c: F→ R. The
goal of convex programming is to find a point in F which minimizes c. In online convex …

Cited by 3024 · Related articles · All 15 versions

[BOOK][B] Robustness

LP Hansen, TJ Sargent - 2008 - degruyter.com

The standard theory of decision making under uncertainty advises the decision maker to
form a statistical model linking outcomes to decisions and then to choose the optimal …

Cited by 1819 · Related articles · All 7 versions