- Academic resource search

Multi-agent reinforcement learning: A selective overview of theories and algorithms

K Zhang, Z Yang, T Başar - Handbook of reinforcement learning and …, 2021 - Springer

Recent years have witnessed significant advances in reinforcement learning (RL), which
has registered tremendous success in solving various sequential decision-making problems …

Cited by 1638 · Related articles · All 8 versions

[PDF] arxiv.org

An overview of multi-agent reinforcement learning from game theoretical perspective

Y Yang, J Wang - arXiv preprint arXiv:2011.00583, 2020 - arxiv.org

Following the remarkable success of the AlphaGO series, 2019 was a booming year that
witnessed significant advances in multi-agent reinforcement learning (MARL) techniques …

Cited by 340 · Related articles · All 2 versions

[PDF] science.org

Mastering the game of Stratego with model-free multiagent reinforcement learning

J Perolat, B De Vylder, D Hennes, E Tarassov, F Strub… - Science, 2022 - science.org

We introduce DeepNash, an autonomous agent that plays the imperfect information game
Stratego at a human expert level. Stratego is one of the few iconic board games that artificial …

Cited by 230 · Related articles · All 6 versions

[PDF] ai-plans.com

[PDF] Nash learning from human feedback

R Munos, M Valko, D Calandriello, MG Azar… - arXiv preprint arXiv …, 2023 - ai-plans.com

Large language models (LLMs) (Anil et al., 2023; Glaese et al., 2022; OpenAI, 2023; Ouyang
et al., 2022) have made remarkable strides in enhancing natural language understanding …

Cited by 82 · Related articles · All 5 versions

[PDF] neurips.cc

Independent policy gradient methods for competitive reinforcement learning

C Daskalakis, DJ Foster… - Advances in neural …, 2020 - proceedings.neurips.cc

We obtain global, non-asymptotic convergence guarantees for independent learning
algorithms in competitive reinforcement learning settings with two agents (i.e., zero-sum …

Cited by 202 · Related articles · All 7 versions

[PDF] arxiv.org

Language agents with reinforcement learning for strategic play in the werewolf game

Z Xu, C Yu, F Fang, Y Wang, Y Wu - arXiv preprint arXiv:2310.18940, 2023 - arxiv.org

Agents built with large language models (LLMs) have recently achieved great
advancements. However, most of the efforts focus on single-agent or cooperative settings …

Cited by 54 · Related articles · All 4 versions

[PDF] neurips.cc

Fictitious play for mean field games: Continuous time analysis and applications

S Perrin, J Pérolat, M Laurière… - Advances in neural …, 2020 - proceedings.neurips.cc

In this paper, we deepen the analysis of continuous time Fictitious Play learning algorithm to
the consideration of various finite state Mean Field Game settings (finite horizon, $\gamma …

Cited by 139 · Related articles · All 11 versions

[PDF] science.org Full View

Student of Games: A unified learning algorithm for both perfect and imperfect information games

M Schmid, M Moravčík, N Burch, R Kadlec… - Science …, 2023 - science.org

Games have a long history as benchmarks for progress in artificial intelligence. Approaches
using search and learning produced strong performance across many perfect information …

Cited by 72 · Related articles · All 8 versions

[PDF] mlr.press

Independent natural policy gradient always converges in Markov potential games

R Fox, SM Mcaleer, W Overman… - International …, 2022 - proceedings.mlr.press

Natural policy gradient has emerged as one of the most successful algorithms for computing
optimal policies in challenging Reinforcement Learning (RL) tasks, yet, very little was known …

Cited by 58 · Related articles · All 7 versions

[PDF] neurips.cc

Escaping the gravitational pull of softmax

J Mei, C Xiao, B Dai, L Li… - Advances in …, 2020 - proceedings.neurips.cc

The softmax is the standard transformation used in machine learning to map real-valued
vectors to categorical distributions. Unfortunately, this transform poses serious drawbacks for …

Cited by 59 · Related articles · All 10 versions