Multi-agent reinforcement learning: A selective overview of theories and algorithms
Recent years have witnessed significant advances in reinforcement learning (RL), which
has registered tremendous success in solving various sequential decision-making problems …
An overview of multi-agent reinforcement learning from game theoretical perspective
Y Yang, J Wang - arXiv preprint arXiv:2011.00583, 2020 - arxiv.org
Following the remarkable success of the AlphaGO series, 2019 was a booming year that
witnessed significant advances in multi-agent reinforcement learning (MARL) techniques …
A modern introduction to online learning
F Orabona - arXiv preprint arXiv:1912.13213, 2019 - arxiv.org
In this monograph, I introduce the basic concepts of Online Learning through a modern view
of Online Convex Optimization. Here, online learning refers to the framework of regret …
[BOOK][B] Partially observed Markov decision processes
V Krishnamurthy - 2016 - books.google.com
Covering formulation, algorithms, and structural results, and linking theory to real-world
applications in controlled sensing (including social learning, adaptive radars and sequential …
Boosting: Foundations and algorithms
RE Schapire, Y Freund - Kybernetes, 2013 - emerald.com
The term “boosting” denotes a powerful means of facilitating machine learning that was
invented by the book's authors 20 years ago and intensively developed since. Despite this …
Potential games
D Monderer, LS Shapley - Games and economic behavior, 1996 - Elsevier
Potential Games Page 1 GAMES AND ECONOMIC BEHAVIOR 14, 124–143 (1996)
ARTICLE NO. 0044 Potential Games Dov Monderer ∗ Faculty of Industrial Engineering and …
[BOOK][B] Prediction, learning, and games
N Cesa-Bianchi, G Lugosi - 2006 - books.google.com
This important text and reference for researchers and students in machine learning, game
theory, statistics and information theory offers a comprehensive treatment of the problem of …
The nonstochastic multiarmed bandit problem
In the multiarmed bandit problem, a gambler must decide which arm of K nonidentical slot
machines to play in a sequence of trials so as to maximize his reward. This classical …
[PDF][PDF] Online convex programming and generalized infinitesimal gradient ascent
M Zinkevich - Proceedings of the 20th international conference on …, 2003 - cdn.aaai.org
Convex programming involves a convex set F⊆ Rn and a convex cost function c: F→ R. The
goal of convex programming is to find a point in F which minimizes c. In online convex …
[BOOK][B] Robustness
LP Hansen, TJ Sargent - 2008 - degruyter.com
The standard theory of decision making under uncertainty advises the decision maker to
form a statistical model linking outcomes to decisions and then to choose the optimal …