The statistical complexity of interactive decision making

DJ Foster, SM Kakade, J Qian, A Rakhlin - arXiv preprint arXiv:2112.13487, 2021 - arxiv.org
A fundamental challenge in interactive learning and decision making, ranging from bandit
problems to reinforcement learning, is to provide sample-efficient, adaptive learning …

Optimal Distributed Online Prediction Using Mini-Batches.

O Dekel, R Gilad-Bachrach, O Shamir, L Xiao - Journal of Machine …, 2012 - jmlr.org
Online prediction methods are typically presented as serial algorithms running on a single
processor. However, in the age of web-scale prediction problems, it is increasingly common …

Optimal Algorithms for Online Convex Optimization with Multi-Point Bandit Feedback.

A Agarwal, O Dekel, L Xiao - COLT, 2010 - Citeseer
Bandit convex optimization is a special case of online convex optimization with partial
information. In this setting, a player attempts to minimize a sequence of adversarially …

Beyond the regret minimization barrier: optimal algorithms for stochastic strongly-convex optimization

E Hazan, S Kale - The Journal of Machine Learning Research, 2014 - jmlr.org
Journal of Machine Learning Research 15 (2014) 2489-2512. Submitted …

Trading regret for efficiency: online convex optimization with long term constraints

M Mahdavi, R Jin, T Yang - The Journal of Machine Learning Research, 2012 - jmlr.org
In this paper we propose efficient algorithms for solving constrained online convex
optimization problems. Our motivation stems from the observation that most algorithms …

PAMR: Passive aggressive mean reversion strategy for portfolio selection

B Li, P Zhao, SCH Hoi, V Gopalkrishnan - Machine learning, 2012 - Springer
This article proposes a novel online portfolio selection strategy named “Passive Aggressive
Mean Reversion” (PAMR). Unlike traditional trend following approaches, the proposed …

Dynamic regret of strongly adaptive methods

L Zhang, T Yang, ZH Zhou - International conference on …, 2018 - proceedings.mlr.press
To cope with changing environments, recent developments in online learning have
introduced the concepts of adaptive regret and dynamic regret independently. In this paper …

Beyond the regret minimization barrier: an optimal algorithm for stochastic strongly-convex optimization

E Hazan, S Kale - … of the 24th Annual Conference on …, 2011 - proceedings.mlr.press
We give a novel algorithm for stochastic strongly-convex optimization in the gradient oracle
model which returns an $O(\frac{1}{T})$-approximate solution after $T$ gradient updates …

Sequential complexities and uniform martingale laws of large numbers

A Rakhlin, K Sridharan, A Tewari - Probability theory and related fields, 2015 - Springer
We establish necessary and sufficient conditions for a uniform martingale Law of Large
Numbers. We extend the technique of symmetrization to the case of dependent random …

Unconstrained dynamic regret via sparse coding

Z Zhang, A Cutkosky… - Advances in Neural …, 2024 - proceedings.neurips.cc
Motivated by the challenge of nonstationarity in sequential decision making, we study Online
Convex Optimization (OCO) under the coupling of two problem structures: the domain is …