Multi-agent reinforcement learning: A selective overview of theories and algorithms

K Zhang, Z Yang, T Başar - Handbook of reinforcement learning and …, 2021 - Springer
Recent years have witnessed significant advances in reinforcement learning (RL), which
has registered tremendous success in solving various sequential decision-making problems …

An overview of multi-agent reinforcement learning from game theoretical perspective

Y Yang, J Wang - arXiv preprint arXiv:2011.00583, 2020 - arxiv.org
Following the remarkable success of the AlphaGO series, 2019 was a booming year that
witnessed significant advances in multi-agent reinforcement learning (MARL) techniques …

Mastering the game of Stratego with model-free multiagent reinforcement learning

J Perolat, B De Vylder, D Hennes, E Tarassov, F Strub… - Science, 2022 - science.org
We introduce DeepNash, an autonomous agent that plays the imperfect information game
Stratego at a human expert level. Stratego is one of the few iconic board games that artificial …

Nash learning from human feedback

R Munos, M Valko, D Calandriello, MG Azar… - arXiv preprint arXiv …, 2023 - ai-plans.com
Large language models (LLMs) (Anil et al., 2023; Glaese et al., 2022; OpenAI, 2023; Ouyang
et al., 2022) have made remarkable strides in enhancing natural language understanding …

Independent policy gradient methods for competitive reinforcement learning

C Daskalakis, DJ Foster… - Advances in neural …, 2020 - proceedings.neurips.cc
We obtain global, non-asymptotic convergence guarantees for independent learning
algorithms in competitive reinforcement learning settings with two agents (i.e., zero-sum …

Language agents with reinforcement learning for strategic play in the werewolf game

Z Xu, C Yu, F Fang, Y Wang, Y Wu - arXiv preprint arXiv:2310.18940, 2023 - arxiv.org
Agents built with large language models (LLMs) have recently achieved great
advancements. However, most of the efforts focus on single-agent or cooperative settings …

Fictitious play for mean field games: Continuous time analysis and applications

S Perrin, J Pérolat, M Laurière… - Advances in neural …, 2020 - proceedings.neurips.cc
In this paper, we deepen the analysis of continuous time Fictitious Play learning algorithm to
the consideration of various finite state Mean Field Game settings (finite horizon, $\gamma …

Student of Games: A unified learning algorithm for both perfect and imperfect information games

M Schmid, M Moravčík, N Burch, R Kadlec… - Science …, 2023 - science.org
Games have a long history as benchmarks for progress in artificial intelligence. Approaches
using search and learning produced strong performance across many perfect information …

Independent natural policy gradient always converges in Markov potential games

R Fox, SM McAleer, W Overman… - International …, 2022 - proceedings.mlr.press
Natural policy gradient has emerged as one of the most successful algorithms for computing
optimal policies in challenging Reinforcement Learning (RL) tasks, yet, very little was known …

Escaping the gravitational pull of softmax

J Mei, C Xiao, B Dai, L Li… - Advances in …, 2020 - proceedings.neurips.cc
The softmax is the standard transformation used in machine learning to map real-valued
vectors to categorical distributions. Unfortunately, this transform poses serious drawbacks for …