A stochastic view of optimal regret through minimax duality

DJ Foster, SM Kakade, J Qian, A Rakhlin - arXiv preprint arXiv:2112.13487, 2021 - arxiv.org

A fundamental challenge in interactive learning and decision making, ranging from bandit
problems to reinforcement learning, is to provide sample-efficient, adaptive learning …

Cited by 170 - Related articles - All 6 versions

[PDF] Optimal Distributed Online Prediction Using Mini-Batches.

O Dekel, R Gilad-Bachrach, O Shamir, L Xiao - Journal of Machine …, 2012 - jmlr.org

Online prediction methods are typically presented as serial algorithms running on a single
processor. However, in the age of web-scale prediction problems, it is increasingly common …

Cited by 788 - Related articles - All 24 versions

[PDF] Optimal Algorithms for Online Convex Optimization with Multi-Point Bandit Feedback.

A Agarwal, O Dekel, L Xiao - COLT, 2010 - Citeseer

Bandit convex optimization is a special case of online convex optimization with partial
information. In this setting, a player attempts to minimize a sequence of adversarially …

Cited by 417 - Related articles - All 7 versions

[PDF] Beyond the regret minimization barrier: optimal algorithms for stochastic strongly-convex optimization

E Hazan, S Kale - The Journal of Machine Learning Research, 2014 - jmlr.org

Journal of Machine Learning Research 15 (2014) 2489-2512. Submitted …

Cited by 312 - Related articles - All 8 versions

[PDF] Trading regret for efficiency: online convex optimization with long term constraints

M Mahdavi, R Jin, T Yang - The Journal of Machine Learning Research, 2012 - jmlr.org

In this paper we propose efficient algorithms for solving constrained online convex
optimization problems. Our motivation stems from the observation that most algorithms …

Cited by 275 - Related articles - All 11 versions

PAMR: Passive aggressive mean reversion strategy for portfolio selection

B Li, P Zhao, SCH Hoi, V Gopalkrishnan - Machine learning, 2012 - Springer

This article proposes a novel online portfolio selection strategy named “Passive Aggressive
Mean Reversion” (PAMR). Unlike traditional trend following approaches, the proposed …

Cited by 224 - Related articles - All 17 versions

Dynamic regret of strongly adaptive methods

L Zhang, T Yang, ZH Zhou - International conference on …, 2018 - proceedings.mlr.press

To cope with changing environments, recent developments in online learning have
introduced the concepts of adaptive regret and dynamic regret independently. In this paper …

Cited by 115 - Related articles - All 12 versions

Beyond the regret minimization barrier: an optimal algorithm for stochastic strongly-convex optimization

E Hazan, S Kale - … of the 24th Annual Conference on …, 2011 - proceedings.mlr.press

We give a novel algorithm for stochastic strongly-convex optimization in the gradient oracle
model which returns an $O(\frac{1}{T})$-approximate solution after $T$ gradient updates …

Cited by 166 - Related articles - All 10 versions

Sequential complexities and uniform martingale laws of large numbers

A Rakhlin, K Sridharan, A Tewari - Probability theory and related fields, 2015 - Springer

We establish necessary and sufficient conditions for a uniform martingale Law of Large
Numbers. We extend the technique of symmetrization to the case of dependent random …

Cited by 126 - Related articles - All 15 versions

Unconstrained dynamic regret via sparse coding

Z Zhang, A Cutkosky… - Advances in Neural …, 2024 - proceedings.neurips.cc

Motivated by the challenge of nonstationarity in sequential decision making, we study Online
Convex Optimization (OCO) under the coupling of two problem structures: the domain is …

Cited by 8 - Related articles - All 7 versions