Dynamic regret of policy optimization in non-stationary environments

Y Fei, Z Yang, Z Wang, Q Xie - Advances in Neural …, 2020 - proceedings.neurips.cc
We consider reinforcement learning (RL) in episodic MDPs with adversarial full-information
reward feedback and unknown fixed transition kernels. We propose two model-free policy …

Minibatch forward-backward-forward methods for solving stochastic variational inequalities

RI Boţ, P Mertikopoulos, M Staudigl… - Stochastic …, 2021 - pubsonline.informs.org
We develop a new stochastic algorithm for solving pseudomonotone stochastic variational
inequalities. Our method builds on Tseng's forward-backward-forward algorithm, which is …

Distributed no-regret learning in multiagent systems: Challenges and recent developments

X Xu, Q Zhao - IEEE Signal Processing Magazine, 2020 - ieeexplore.ieee.org
Game theory is a well-established tool for studying interactions among self-interested
players. Under the assumption of complete information on the game composition at each …

Risk-averse no-regret learning in online convex games

Z Wang, Y Shen, M Zavlanos - International Conference on …, 2022 - proceedings.mlr.press
We consider an online stochastic game with risk-averse agents whose goal is to learn
optimal decisions that minimize the risk of incurring significantly high costs. Specifically, we …

Contextual games: Multi-agent learning with side information

PG Sessa, I Bogunovic, A Krause… - Advances in Neural …, 2020 - proceedings.neurips.cc
We formulate the novel class of contextual games, a type of repeated games driven by
contextual information at each round. By means of kernel-based regularity assumptions, we …

Online non-convex optimization with imperfect feedback

A Héliou, M Martin, P Mertikopoulos… - Advances in Neural …, 2020 - proceedings.neurips.cc
We consider the problem of online learning with non-convex losses. In terms of feedback,
we assume that the learner observes–or otherwise constructs–an inexact model for the loss …

Evolutionary game theory squared: Evolving agents in endogenously evolving zero-sum games

S Skoulakis, T Fiez, R Sim, G Piliouras… - Proceedings of the AAAI …, 2021 - ojs.aaai.org
The predominant paradigm in evolutionary game theory and more generally online learning
in games is based on a clear distinction between a population of dynamic agents that …

Stochastic relaxed inertial forward-backward-forward splitting for monotone inclusions in Hilbert spaces

S Cui, U Shanbhag, M Staudigl, P Vuong - … Optimization and Applications, 2022 - Springer
We consider monotone inclusions defined on a Hilbert space where the operator is given by
the sum of a maximal monotone operator T and a single-valued monotone, Lipschitz …

Forward-backward-forward methods with variance reduction for stochastic variational inequalities

RI Boţ, P Mertikopoulos, M Staudigl… - arXiv preprint arXiv …, 2019 - arxiv.org
We develop a new stochastic algorithm with variance reduction for solving
pseudo-monotone stochastic variational inequalities. Our method builds on Tseng's forward …

Online learning in periodic zero-sum games

T Fiez, R Sim, S Skoulakis… - Advances in Neural …, 2021 - proceedings.neurips.cc
A seminal result in game theory is von Neumann's minmax theorem, which states that
zero-sum games admit an essentially unique equilibrium solution. Classical learning results build …