Dynamic regret of policy optimization in non-stationary environments
We consider reinforcement learning (RL) in episodic MDPs with adversarial full-information
reward feedback and unknown fixed transition kernels. We propose two model-free policy …
Minibatch forward-backward-forward methods for solving stochastic variational inequalities
We develop a new stochastic algorithm for solving pseudomonotone stochastic variational
inequalities. Our method builds on Tseng's forward-backward-forward algorithm, which is …
Distributed no-regret learning in multiagent systems: Challenges and recent developments
Game theory is a well-established tool for studying interactions among self-interested
players. Under the assumption of complete information on the game composition at each …
Risk-averse no-regret learning in online convex games
We consider an online stochastic game with risk-averse agents whose goal is to learn
optimal decisions that minimize the risk of incurring significantly high costs. Specifically, we …
Contextual games: Multi-agent learning with side information
We formulate the novel class of contextual games, a type of repeated games driven by
contextual information at each round. By means of kernel-based regularity assumptions, we …
Online non-convex optimization with imperfect feedback
We consider the problem of online learning with non-convex losses. In terms of feedback,
we assume that the learner observes, or otherwise constructs, an inexact model for the loss …
Evolutionary game theory squared: Evolving agents in endogenously evolving zero-sum games
The predominant paradigm in evolutionary game theory and more generally online learning
in games is based on a clear distinction between a population of dynamic agents that …
Stochastic relaxed inertial forward-backward-forward splitting for monotone inclusions in Hilbert spaces
We consider monotone inclusions defined on a Hilbert space where the operator is given by
the sum of a maximal monotone operator T and a single-valued monotone, Lipschitz …
Forward-backward-forward methods with variance reduction for stochastic variational inequalities
We develop a new stochastic algorithm with variance reduction for solving pseudo-
monotone stochastic variational inequalities. Our method builds on Tseng's forward …
Online learning in periodic zero-sum games
A seminal result in game theory is von Neumann's minmax theorem, which states that zero-
sum games admit an essentially unique equilibrium solution. Classical learning results build …