Beyond UCB: Optimal and efficient contextual bandits with regression oracles

D Foster, A Rakhlin - International Conference on Machine …, 2020 - proceedings.mlr.press
A fundamental challenge in contextual bandits is to develop flexible, general-purpose
algorithms with computational requirements no worse than classical supervised learning …

Adapting to misspecification in contextual bandits

DJ Foster, C Gentile, M Mohri… - Advances in Neural …, 2020 - proceedings.neurips.cc
A major research direction in contextual bandits is to develop algorithms that are
computationally efficient, yet support flexible, general-purpose function approximation …

Smoothed online learning is as easy as statistical learning

A Block, Y Dagan, N Golowich… - … on Learning Theory, 2022 - proceedings.mlr.press
Much of modern learning theory has been split between two regimes: the classical offline
setting, where data arrive independently, and the online setting, where data arrive …

Optimal dynamic regret in exp-concave online learning

D Baby, YX Wang - Conference on Learning Theory, 2021 - proceedings.mlr.press
We consider the problem of the Zinkevich (2003)-style dynamic regret minimization in online
learning with exp-concave losses. We show that whenever improper learning is …

Optimal dynamic regret in proper online learning with strongly convex losses and beyond

D Baby, YX Wang - International Conference on Artificial …, 2022 - proceedings.mlr.press
We study the framework of universal dynamic regret minimization with strongly convex
losses. We answer an open problem in Baby and Wang 2021 by showing that in a proper …

Unconstrained dynamic regret via sparse coding

Z Zhang, A Cutkosky… - Advances in Neural …, 2024 - proceedings.neurips.cc
Motivated by the challenge of nonstationarity in sequential decision making, we study Online
Convex Optimization (OCO) under the coupling of two problem structures: the domain is …

Contextual bandits with smooth regret: Efficient learning in continuous action spaces

Y Zhu, P Mineiro - International Conference on Machine …, 2022 - proceedings.mlr.press
Designing efficient general-purpose contextual bandit algorithms that work with large—or
even infinite—action spaces would facilitate application to important scenarios such as …

Learning to bid optimally and efficiently in adversarial first-price auctions

Y Han, Z Zhou, A Flores, E Ordentlich… - arXiv preprint arXiv …, 2020 - arxiv.org
First-price auctions have very recently swept the online advertising industry, replacing
second-price auctions as the predominant auction mechanism on many platforms. This shift …

Chaining meets chain rule: Multilevel entropic regularization and training of neural networks

AR Asadi, E Abbe - Journal of Machine Learning Research, 2020 - jmlr.org
We derive generalization and excess risk bounds for neural networks using a family of
complexity measures based on a multilevel relative entropy. The bounds are obtained by …

Online label shift: Optimal dynamic regret meets practical algorithms

D Baby, S Garg, TC Yen… - Advances in …, 2024 - proceedings.neurips.cc
This paper focuses on supervised and unsupervised online label shift, where the class
marginals $Q(y)$ varies but the class-conditionals $Q(x|y)$ remain invariant. In the …