When can we learn general-sum Markov games with a large number of players sample-efficiently?

Z Song, S Mei, Y Bai - arXiv preprint arXiv:2110.04184, 2021 - arxiv.org
Multi-agent reinforcement learning has made substantial empirical progress in solving
games with a large number of players. However, theoretically, the best known sample …

Near-optimal no-regret learning for correlated equilibria in multi-player general-sum games

I Anagnostides, C Daskalakis, G Farina… - Proceedings of the 54th …, 2022 - dl.acm.org
Recently, Daskalakis, Fishelson, and Golowich (DFG)(NeurIPS '21) showed that if all agents
in a multi-player general-sum normal-form game employ Optimistic Multiplicative Weights …

Breaking the curse of multiagents in a large state space: RL in Markov games with independent linear function approximation

Q Cui, K Zhang, S Du - The Thirty Sixth Annual Conference …, 2023 - proceedings.mlr.press
We propose a new model, independent linear Markov game, for multi-agent
reinforcement learning with a large state space and a large number of agents. This is a class …

From External to Swap Regret 2.0: An Efficient Reduction for Large Action Spaces

Y Dagan, C Daskalakis, M Fishelson… - Proceedings of the 56th …, 2024 - dl.acm.org
We provide a novel reduction from swap-regret minimization to external-regret minimization,
which improves upon the classical reductions of Blum-Mansour and Stoltz-Lugosi in that it …

Oracle efficient online multicalibration and omniprediction

S Garg, C Jung, O Reingold, A Roth - Proceedings of the 2024 Annual ACM …, 2024 - SIAM
A recent line of work has shown a surprising connection between multicalibration, a
multi-group fairness notion, and omniprediction, a learning paradigm that provides simultaneous …

Fast swap regret minimization and applications to approximate correlated equilibria

B Peng, A Rubinstein - Proceedings of the 56th Annual ACM Symposium …, 2024 - dl.acm.org
We give a simple and computationally efficient algorithm that, for any constant ε > 0, obtains
εT-swap regret within only T = polylog(n) rounds; this is an exponential improvement compared to the …

Persuading a learning agent

T Lin, Y Chen - arXiv preprint arXiv:2402.09721, 2024 - arxiv.org
We study a repeated Bayesian persuasion problem (and more generally, any generalized
principal-agent problem with complete information) where the principal does not have …

A near-optimal high-probability swap-regret upper bound for multi-agent bandits in unknown general-sum games

Z Huang, J Pan - Uncertainty in Artificial Intelligence, 2023 - proceedings.mlr.press
In this paper, we study a multi-agent bandit problem in an unknown general-sum game
repeated for a number of rounds (i.e., learning in a black-box game with bandit feedback) …

Population-based evaluation in repeated rock-paper-scissors as a benchmark for multiagent reinforcement learning

M Lanctot, J Schultz, N Burch, MO Smith… - arXiv preprint arXiv …, 2023 - arxiv.org
Progress in fields of machine learning and adversarial planning has benefited significantly
from benchmark domains, from checkers and the classic UCI data sets to Go and Diplomacy …

End-to-End Congestion Control as Learning for Unknown Games with Bandit Feedback

Z Huang, K Liu, J Pan - 2023 IEEE 43rd International …, 2023 - ieeexplore.ieee.org
In this paper, we study the open problems raised by Karp et al. in FOCS 2000, where the
authors formulated the end-to-end congestion control as a repeated game between a flow …