Nearly minimax algorithms for linear bandits with shared representation
We give novel algorithms for multi-task and lifelong linear bandits with shared
representation. Specifically, we consider the setting where we play $ M $ linear bandits with …
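The snippet is cut off, but the shared-representation setting it refers to is easy to sketch: each of the $ M $ tasks has a parameter lying in a common low-dimensional subspace. The Python below is my own toy construction of that environment (the dimensions, arm sets, and oracle-greedy play are assumptions, not the paper's algorithm).

```python
# A minimal sketch of the shared-representation setting: each task m has parameter
# theta_m = B @ w_m with a common B in R^{d x k}, k << d, and pulling arm x in task m
# yields reward <theta_m, x> plus noise.
import numpy as np

rng = np.random.default_rng(0)
d, k, M, K = 20, 3, 10, 50                      # ambient dim, latent dim, tasks, arms per task

B, _ = np.linalg.qr(rng.normal(size=(d, k)))    # shared representation (orthonormal columns)
W = rng.normal(size=(k, M))                     # task-specific low-dimensional parameters
Theta = B @ W                                   # true parameters, one column per task

arms = rng.normal(size=(M, K, d))               # action sets, one per task

def pull(task, arm_idx, noise=0.1):
    """Reward for playing arm `arm_idx` in bandit `task`."""
    x = arms[task, arm_idx]
    return x @ Theta[:, task] + noise * rng.normal()

# Exercise the environment with an oracle that knows Theta (not a learning algorithm).
for m in range(M):
    best = int(np.argmax(arms[m] @ Theta[:, m]))
    print(f"task {m}: oracle arm {best}, sample reward {pull(m, best):.3f}")
```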
Optimal gradient-based algorithms for non-concave bandit optimization
Bandit problems with linear or concave reward have been extensively studied, but relatively
few works have studied bandits with non-concave reward. This work considers a large family …
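As a rough, hedged illustration of what gradient-based bandit optimization of a non-concave reward can look like, here is a two-point zeroth-order ascent sketch; the reward function, smoothing radius, and step size are my own choices, and this is not the algorithm analyzed in the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 5

def reward(x):                                   # a non-concave reward, chosen only for illustration
    return -np.sum(x**4) + 2 * np.sum(x**2)

def noisy_reward(x, sigma=0.01):                 # bandit feedback: noisy evaluations only
    return reward(x) + sigma * rng.normal()

x = 0.1 * rng.normal(size=d)
delta, eta = 0.05, 0.01                          # smoothing radius and step size
for t in range(3000):
    u = rng.normal(size=d)
    u /= np.linalg.norm(u)
    # two-point gradient estimate from bandit feedback
    g = d * (noisy_reward(x + delta * u) - noisy_reward(x - delta * u)) / (2 * delta) * u
    x = x + eta * g                              # gradient ascent on the reward
print("final reward:", round(reward(x), 3), "(each coordinate near +-1 is optimal)")
```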
Context-lumpable stochastic bandits
We consider a contextual bandit problem with $ S $ contexts and $ K $ actions. In each
round $ t = 1, 2, \dots $ the learner observes a random context and chooses an action based …
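The interaction protocol in the snippet (a random context each round, an action chosen among $ K $) is easy to make concrete. The sketch below is my own toy instantiation in which the $ S $ contexts secretly lump into a few groups with identical mean rewards; the learner shown is a naive per-context UCB baseline, not the paper's algorithm.

```python
import numpy as np

rng = np.random.default_rng(2)
S, K, m, T = 12, 4, 3, 5000
group_of = rng.integers(m, size=S)          # hidden lumping of contexts into m groups
mu = rng.uniform(size=(m, K))               # mean reward of action a in group g

counts = np.ones((S, K))                    # per-context statistics (start at 1 to avoid /0)
sums = np.zeros((S, K))
for t in range(1, T + 1):
    s = rng.integers(S)                                           # random context revealed
    ucb = sums[s] / counts[s] + np.sqrt(2 * np.log(t) / counts[s])
    a = int(np.argmax(ucb))
    r = float(rng.random() < mu[group_of[s], a])                  # Bernoulli reward
    counts[s, a] += 1
    sums[s, a] += r
print("empirical means per context:\n", np.round(sums / counts, 2))
```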
Multi-task representation learning for pure exploration in linear bandits
Despite the recent success of representation learning in sequential decision making, the
study of the pure exploration scenario (i.e., identify the best option and minimize the sample …
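For a sense of what pure exploration in a linear bandit means operationally, here is a minimal single-task sketch: spend a fixed exploration budget, fit the parameter by least squares, and return the arm with the largest predicted reward. The multi-task representation-learning aspect of the cited work is not reproduced, and all dimensions and the budget are my own assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
d, K, budget, sigma = 8, 30, 2000, 0.5
arms = rng.normal(size=(K, d))
theta = rng.normal(size=d)

X, y = [], []
for t in range(budget):
    a = rng.integers(K)                      # uniform exploration
    X.append(arms[a])
    y.append(arms[a] @ theta + sigma * rng.normal())
X, y = np.array(X), np.array(y)
theta_hat = np.linalg.lstsq(X, y, rcond=None)[0]

best_hat = int(np.argmax(arms @ theta_hat))
best_true = int(np.argmax(arms @ theta))
print("identified arm", best_hat, "| true best arm", best_true)
```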
Multi-armed quantum bandits: Exploration versus exploitation when learning properties of quantum states
We initiate the study of tradeoffs between exploration and exploitation in online learning of
properties of quantum states. Given sequential oracle access to an unknown quantum state …
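A hedged toy instantiation of the setting, as I read the snippet: arms correspond to observables, pulling an arm measures a fresh copy of the unknown state in that observable's eigenbasis, and the reward is the observed eigenvalue, so the expected reward of arm $ O $ is $ \operatorname{Tr}(\rho O) $. The UCB learner and the choice of Pauli observables below are my own assumptions, not the paper's procedure.

```python
import numpy as np

rng = np.random.default_rng(4)

# Unknown qubit state rho (a fixed pure-state density matrix) and a few Hermitian observables.
psi = rng.normal(size=2) + 1j * rng.normal(size=2)
psi /= np.linalg.norm(psi)
rho = np.outer(psi, psi.conj())

paulis = [np.array([[0, 1], [1, 0]], complex),
          np.array([[0, -1j], [1j, 0]]),
          np.array([[1, 0], [0, -1]], complex)]

def measure(obs):
    """Sample an eigenvalue of `obs` on a fresh copy of rho (Born rule)."""
    vals, vecs = np.linalg.eigh(obs)
    probs = np.real(np.einsum('ij,jk,ki->i', vecs.conj().T, rho, vecs))
    probs = np.clip(probs, 0, None)
    probs /= probs.sum()
    return float(rng.choice(vals, p=probs))

K, T = len(paulis), 3000
counts, sums = np.ones(K), np.zeros(K)          # counts start at 1 to avoid division by zero
for t in range(1, T + 1):
    ucb = sums / counts + np.sqrt(2 * np.log(t) / counts)
    a = int(np.argmax(ucb))
    r = measure(paulis[a])
    counts[a] += 1
    sums[a] += r

true_means = [np.real(np.trace(rho @ O)) for O in paulis]
print("empirical means:", np.round(sums / counts, 3), "| Tr(rho O):", np.round(true_means, 3))
```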
Gaussian imagination in bandit learning
Assuming distributions are Gaussian often facilitates computations that are otherwise
intractable. We study the performance of an agent that attains a bounded information ratio …
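One concrete way to read "Gaussian imagination" (my own illustrative reading, not necessarily the paper's agent): run an agent whose posterior updates assume Gaussian rewards on an environment that is not Gaussian, e.g. Bernoulli, so the model is misspecified but the Gaussian computations stay simple. A minimal Thompson-sampling sketch under that assumption:

```python
import numpy as np

rng = np.random.default_rng(5)
mu_true = np.array([0.3, 0.5, 0.7])              # Bernoulli means (non-Gaussian environment)
K, T, sigma2 = len(mu_true), 5000, 1.0           # sigma2: the agent's assumed noise variance

post_mean, post_var = np.zeros(K), np.ones(K)    # independent N(0, 1) priors
reward_total = 0.0
for t in range(T):
    a = int(np.argmax(rng.normal(post_mean, np.sqrt(post_var))))   # sample, then act greedily
    r = float(rng.random() < mu_true[a])
    reward_total += r
    # Standard Gaussian conjugate update with known variance sigma2.
    precision = 1.0 / post_var[a] + 1.0 / sigma2
    post_mean[a] = (post_mean[a] / post_var[a] + r / sigma2) / precision
    post_var[a] = 1.0 / precision
print("average reward:", reward_total / T, "| best achievable:", mu_true.max())
```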
Sample complexity for quadratic bandits: Hessian dependent bounds and optimal algorithms
In stochastic zeroth-order optimization, a problem of practical relevance is understanding
how to fully exploit the local geometry of the underlying objective function. We consider a …
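A minimal sketch of the zeroth-order quadratic setting, under my own assumptions: only noisy function values are available, a symmetric two-point query gives an unbiased directional gradient estimate (exact for a quadratic up to noise), and the step size is scaled by the Hessian's top eigenvalue, which is where the local geometry enters.

```python
import numpy as np

rng = np.random.default_rng(6)
d = 6
A = rng.normal(size=(d, d))
H = A @ A.T + np.eye(d)                          # positive definite Hessian (assumed)
x_star = rng.normal(size=d)                      # unknown minimizer

def f_noisy(x, sigma=1e-3):                      # bandit feedback: noisy function value only
    return 0.5 * (x - x_star) @ H @ (x - x_star) + sigma * rng.normal()

lam_max = np.linalg.eigvalsh(H).max()
eta = 0.5 / (d * lam_max)                        # conservative step tied to the top eigenvalue
x, delta = np.zeros(d), 1e-2
for t in range(20000):
    u = rng.normal(size=d)
    u /= np.linalg.norm(u)
    # symmetric two-point query: exact directional derivative for a quadratic, up to noise
    g = d * (f_noisy(x + delta * u) - f_noisy(x - delta * u)) / (2 * delta) * u
    x = x - eta * g
print("distance to minimizer:", round(float(np.linalg.norm(x - x_star)), 4))
```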
Statistical complexity and optimal algorithms for nonlinear ridge bandits
The Annals of Statistics, 2024, Vol. 52, No. 6, 2557–2582. https://doi.org/10.1214/24-AOS2395
Optimal Sample Complexity Bounds for Non-convex Optimization under Kurdyka-Lojasiewicz Condition
Optimization of smooth reward functions under bandit feedback is a long-standing problem
in online learning. This paper approaches this problem by studying the convergence under …
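To make the setting concrete, below is a small sketch (my own choices throughout) of zeroth-order descent, i.e. descent from noisy function evaluations only, on a non-convex function satisfying a Polyak-Lojasiewicz inequality, a special case of the Kurdyka-Lojasiewicz condition; $ f(x) = x^2 + 3\sin^2(x) $ is a standard such example.

```python
import numpy as np

rng = np.random.default_rng(7)

def f(x):                                        # non-convex, satisfies a PL inequality
    return x**2 + 3 * np.sin(x)**2

def f_noisy(x, sigma=1e-3):                      # bandit feedback: noisy evaluations only
    return f(x) + sigma * rng.normal()

x, delta, eta = 3.0, 1e-2, 0.05
for t in range(3000):
    g = (f_noisy(x + delta) - f_noisy(x - delta)) / (2 * delta)   # two-point gradient estimate
    x -= eta * g
print("final x:", round(x, 4), "f(x):", round(f(x), 6))
```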
Robust Gradient Descent for Phase Retrieval
A Buna, P Rebeschini - arXiv preprint arXiv:2410.10623, 2024 - arxiv.org
Recent progress in robust statistical learning has mainly tackled convex problems, like mean
estimation or linear regression, with non-convex challenges receiving less attention. Phase …
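As background for the phase-retrieval objective the paper works with, here is a minimal sketch of plain (non-robust) gradient descent with spectral initialization on real phaseless measurements $ y_i = (a_i^\top x^\star)^2 $; the paper's robust gradient estimator is not reproduced, and all problem sizes and step sizes are my own choices.

```python
import numpy as np

rng = np.random.default_rng(8)
d, n = 10, 400
x_star = rng.normal(size=d)
A = rng.normal(size=(n, d))
y = (A @ x_star) ** 2                          # phaseless measurements

# Spectral initialization: top eigenvector of (1/n) sum_i y_i a_i a_i^T, rescaled.
M = (A * y[:, None]).T @ A / n
w, V = np.linalg.eigh(M)
x = V[:, -1] * np.sqrt(np.mean(y))

eta = 0.1 / np.mean(y)                         # conservative step size
for t in range(500):
    z = A @ x
    grad = ((z**2 - y) * z) @ A / n            # gradient of (1/4n) sum ((a_i @ x)^2 - y_i)^2
    x -= eta * grad
dist = min(np.linalg.norm(x - x_star), np.linalg.norm(x + x_star))   # global sign ambiguity
print("distance to x_star (up to sign):", round(float(dist), 6))
```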