The statistical complexity of interactive decision making
A fundamental challenge in interactive learning and decision making, ranging from bandit
problems to reinforcement learning, is to provide sample-efficient, adaptive learning …
Optimal Distributed Online Prediction Using Mini-Batches.
Online prediction methods are typically presented as serial algorithms running on a single
processor. However, in the age of web-scale prediction problems, it is increasingly common …
Optimal Algorithms for Online Convex Optimization with Multi-Point Bandit Feedback.
Bandit convex optimization is a special case of online convex optimization with partial
information. In this setting, a player attempts to minimize a sequence of adversarially …
Beyond the regret minimization barrier: optimal algorithms for stochastic strongly-convex optimization
Journal of Machine Learning Research 15 (2014) 2489-2512 Submitted …
Trading regret for efficiency: online convex optimization with long term constraints
In this paper we propose efficient algorithms for solving constrained online convex
optimization problems. Our motivation stems from the observation that most algorithms …
PAMR: Passive aggressive mean reversion strategy for portfolio selection
This article proposes a novel online portfolio selection strategy named “Passive Aggressive
Mean Reversion” (PAMR). Unlike traditional trend following approaches, the proposed …
Dynamic regret of strongly adaptive methods
To cope with changing environments, recent developments in online learning have
introduced the concepts of adaptive regret and dynamic regret independently. In this paper …
Beyond the regret minimization barrier: an optimal algorithm for stochastic strongly-convex optimization
We give a novel algorithm for stochastic strongly-convex optimization in the gradient oracle
model which returns an $O(\frac{1}{T})$-approximate solution after $T$ gradient updates …
Sequential complexities and uniform martingale laws of large numbers
We establish necessary and sufficient conditions for a uniform martingale Law of Large
Numbers. We extend the technique of symmetrization to the case of dependent random …
Unconstrained dynamic regret via sparse coding
Z Zhang, A Cutkosky… - Advances in Neural …, 2024 - proceedings.neurips.cc
Motivated by the challenge of nonstationarity in sequential decision making, we study Online
Convex Optimization (OCO) under the coupling of two problem structures: the domain is …