When can we learn general-sum Markov games with a large number of players sample-efficiently?
Multi-agent reinforcement learning has made substantial empirical progress in solving
games with a large number of players. However, theoretically, the best known sample …
Near-optimal no-regret learning for correlated equilibria in multi-player general-sum games
Recently, Daskalakis, Fishelson, and Golowich (DFG) (NeurIPS '21) showed that if all agents
in a multi-player general-sum normal-form game employ Optimistic Multiplicative Weights …
Breaking the curse of multiagents in a large state space: RL in Markov games with independent linear function approximation
We propose a new model, independent linear Markov game, for multi-agent
reinforcement learning with a large state space and a large number of agents. This is a class …
From External to Swap Regret 2.0: An Efficient Reduction for Large Action Spaces
We provide a novel reduction from swap-regret minimization to external-regret minimization,
which improves upon the classical reductions of Blum-Mansour and Stoltz-Lugosi in that it …
Oracle efficient online multicalibration and omniprediction
A recent line of work has shown a surprising connection between multicalibration, a multi-
group fairness notion, and omniprediction, a learning paradigm that provides simultaneous …
Fast swap regret minimization and applications to approximate correlated equilibria
B Peng, A Rubinstein - Proceedings of the 56th Annual ACM Symposium …, 2024 - dl.acm.org
We give a simple and computationally efficient algorithm that, for any constant ε > 0, obtains
εT-swap regret within only T = polylog(n) rounds; this is an exponential improvement compared to the …
Persuading a learning agent
T Lin, Y Chen - arXiv preprint arXiv:2402.09721, 2024 - arxiv.org
We study a repeated Bayesian persuasion problem (and more generally, any generalized
principal-agent problem with complete information) where the principal does not have …
A near-optimal high-probability swap-regret upper bound for multi-agent bandits in unknown general-sum games
In this paper, we study a multi-agent bandit problem in an unknown general-sum game
repeated for a number of rounds (i.e., learning in a black-box game with bandit feedback) …
Population-based evaluation in repeated rock-paper-scissors as a benchmark for multiagent reinforcement learning
Progress in fields of machine learning and adversarial planning has benefited significantly
from benchmark domains, from checkers and the classic UCI data sets to Go and Diplomacy …
End-to-End Congestion Control as Learning for Unknown Games with Bandit Feedback
In this paper, we study the open problems raised by Karp et al. in FOCS 2000, where the
authors formulated the end-to-end congestion control as a repeated game between a flow …