Optimistic mirror descent in saddle-point problems: Going the extra (gradient) mile
Owing to their connection with generative adversarial networks (GANs), saddle-point
problems have recently attracted considerable interest in machine learning and beyond. By …
Independent policy gradient for large-scale Markov potential games: Sharper rates, function approximation, and game-agnostic convergence
We examine global non-asymptotic convergence properties of policy gradient methods for
multi-agent reinforcement learning (RL) problems in Markov potential games (MPGs). To …
Global convergence of multi-agent policy gradient in Markov potential games
Potential games are arguably one of the most important and widely studied classes of
normal form games. They define the archetypal setting of multi-agent coordination as all …
When can we learn general-sum Markov games with a large number of players sample-efficiently?
Multi-agent reinforcement learning has made substantial empirical progresses in solving
games with a large number of players. However, theoretically, the best known sample …
On improving model-free algorithms for decentralized multi-agent reinforcement learning
Multi-agent reinforcement learning (MARL) algorithms often suffer from an exponential
sample complexity dependence on the number of agents, a phenomenon known as the …
On last-iterate convergence beyond zero-sum games
Most existing results about last-iterate convergence of learning dynamics are limited to two-
player zero-sum games, and only apply under rigid assumptions about what dynamics the …
Distributed multi-player bandits - a Game of Thrones approach
I Bistritz, A Leshem - Advances in Neural Information …, 2018 - proceedings.neurips.cc
We consider a multi-armed bandit game where N players compete for K arms for T turns.
Each player has different expected rewards for the arms, and the instantaneous rewards are …
Bandit learning in concave N-person games
This paper examines the long-run behavior of learning with bandit feedback in non-
cooperative concave games. The bandit framework accounts for extremely low-information …
The limits of min-max optimization algorithms: Convergence to spurious non-critical sets
YP Hsieh, P Mertikopoulos… - … Conference on Machine …, 2021 - proceedings.mlr.press
Compared to minimization, the min-max optimization in machine learning applications is
considerably more convoluted because of the existence of cycles and similar phenomena …
The confluence of networks, games, and learning: A game-theoretic framework for multiagent decision making over networks
Multiagent decision making over networks has recently attracted an exponentially growing
number of researchers from the systems and control community. The area has gained …