A modern introduction to online learning

F Orabona - arXiv preprint arXiv:1912.13213, 2019 - arxiv.org
In this monograph, I introduce the basic concepts of Online Learning through a modern view
of Online Convex Optimization. Here, online learning refers to the framework of regret …

Synthetic control as online linear regression

J Chen - Econometrica, 2023 - Wiley Online Library
This paper notes a simple connection between synthetic control and online learning.
Specifically, we recognize synthetic control as an instance of Follow‐The‐Leader (FTL) …

Optimal rates for bandit nonstochastic control

YJ Sun, S Newman, E Hazan - Advances in Neural …, 2024 - proceedings.neurips.cc
Linear Quadratic Regulator (LQR) and Linear Quadratic Gaussian (LQG) control
are foundational and extensively researched problems in optimal control. We investigate …

Multi-agent online optimization with delays: Asynchronicity, adaptivity, and optimism

YG Hsieh, F Iutzeler, J Malick… - Journal of Machine …, 2022 - jmlr.org
In this paper, we provide a general framework for studying multi-agent online learning
problems in the presence of delays and asynchronicities. Specifically, we propose and …

No-regret learning in games with noisy feedback: Faster rates and adaptivity via learning rate separation

YG Hsieh, K Antonakopoulos… - Advances in …, 2022 - proceedings.neurips.cc
We examine the problem of regret minimization when the learner is involved in a continuous
game with other optimizing agents: in this case, if all players follow a no-regret algorithm, it is …

On anytime learning at macroscale

L Caccia, J Xu, M Ott, M Ranzato… - … on Lifelong Learning …, 2022 - proceedings.mlr.press
In many practical applications of machine learning, data arrives sequentially over time in
large chunks. Practitioners have then to decide how to allocate their computational budget in …

Online Frank-Wolfe with arbitrary delays

Y Wan, WW Tu, L Zhang - Advances in Neural Information …, 2022 - proceedings.neurips.cc
The online Frank-Wolfe (OFW) method has gained much popularity for online
convex optimization due to its projection-free property. Previous studies show that OFW can …

Nonstochastic bandits and experts with arm-dependent delays

D Van Der Hoeven… - … Conference on Artificial …, 2022 - proceedings.mlr.press
We study nonstochastic bandits and experts in a delayed setting where delays depend on
both time and arms. While the setting in which delays only depend on time has been …

Improved Regret for Bandit Convex Optimization with Delayed Feedback

Y Wan, C Yao, M Song, L Zhang - arXiv preprint arXiv:2402.09152, 2024 - arxiv.org
We investigate bandit convex optimization (BCO) with delayed feedback, where only the
loss value of the action is revealed under an arbitrary delay. Previous studies have …

Asynchronous gradient play in zero-sum multi-agent games

R Ao, S Cen, Y Chi - arXiv preprint arXiv:2211.08980, 2022 - arxiv.org
Finding equilibria via gradient play in competitive multi-agent games has been attracting a
growing amount of attention in recent years, with emphasis on designing efficient strategies …