Online learning with optimism and delay

F Orabona - arXiv preprint arXiv:1912.13213, 2019 - arxiv.org

In this monograph, I introduce the basic concepts of Online Learning through a modern view
of Online Convex Optimization. Here, online learning refers to the framework of regret …

Cited by 409 - Related articles - All 3 versions

Synthetic control as online linear regression

J Chen - Econometrica, 2023 - Wiley Online Library

This paper notes a simple connection between synthetic control and online learning.
Specifically, we recognize synthetic control as an instance of Follow‐The‐Leader (FTL) …

Cited by 25 - Related articles - All 9 versions

Optimal rates for bandit nonstochastic control

YJ Sun, S Newman, E Hazan - Advances in Neural …, 2024 - proceedings.neurips.cc

Linear Quadratic Regulator (LQR) and Linear Quadratic Gaussian (LQG) control
are foundational and extensively researched problems in optimal control. We investigate …

Cited by 8 - Related articles - All 6 versions

Multi-agent online optimization with delays: Asynchronicity, adaptivity, and optimism

YG Hsieh, F Iutzeler, J Malick… - Journal of Machine …, 2022 - jmlr.org

In this paper, we provide a general framework for studying multi-agent online learning
problems in the presence of delays and asynchronicities. Specifically, we propose and …

Cited by 37 - Related articles - All 15 versions

No-regret learning in games with noisy feedback: Faster rates and adaptivity via learning rate separation

YG Hsieh, K Antonakopoulos… - Advances in …, 2022 - proceedings.neurips.cc

We examine the problem of regret minimization when the learner is involved in a continuous
game with other optimizing agents: in this case, if all players follow a no-regret algorithm, it is …

Cited by 28 - Related articles - All 36 versions

On anytime learning at macroscale

L Caccia, J Xu, M Ott, M Ranzato… - … on Lifelong Learning …, 2022 - proceedings.mlr.press

In many practical applications of machine learning data arrives sequentially over time in
large chunks. Practitioners have then to decide how to allocate their computational budget in …

Cited by 23 - Related articles - All 5 versions

Online Frank-Wolfe with arbitrary delays

Y Wan, WW Tu, L Zhang - Advances in Neural Information …, 2022 - proceedings.neurips.cc

The online Frank-Wolfe (OFW) method has gained much popularity for online
convex optimization due to its projection-free property. Previous studies show that OFW can …

Cited by 7 - Related articles - All 7 versions

Nonstochastic bandits and experts with arm-dependent delays

D Van Der Hoeven… - … Conference on Artificial …, 2022 - proceedings.mlr.press

We study nonstochastic bandits and experts in a delayed setting where delays depend on
both time and arms. While the setting in which delays only depend on time has been …

Cited by 14 - Related articles - All 5 versions

Improved Regret for Bandit Convex Optimization with Delayed Feedback

Y Wan, C Yao, M Song, L Zhang - arXiv preprint arXiv:2402.09152, 2024 - arxiv.org

We investigate bandit convex optimization (BCO) with delayed feedback, where only the
loss value of the action is revealed under an arbitrary delay. Previous studies have …

Cited by 2 - Related articles - All 2 versions

Asynchronous gradient play in zero-sum multi-agent games

R Ao, S Cen, Y Chi - arXiv preprint arXiv:2211.08980, 2022 - arxiv.org

Finding equilibria via gradient play in competitive multi-agent games has been attracting a
growing amount of attention in recent years, with emphasis on designing efficient strategies …

Cited by 7 - Related articles - All 6 versions