Bandit phase retrieval

Nearly minimax algorithms for linear bandits with shared representation

J Yang, Q Lei, JD Lee, SS Du - arXiv preprint arXiv:2203.15664, 2022 - arxiv.org

We give novel algorithms for multi-task and lifelong linear bandits with shared
representation. Specifically, we consider the setting where we play $M$ linear bandits with …

Cited by 14 · Related articles · All 3 versions

[PDF] neurips.cc

Optimal gradient-based algorithms for non-concave bandit optimization

B Huang, K Huang, S Kakade, JD Lee… - Advances in …, 2021 - proceedings.neurips.cc

Bandit problems with linear or concave reward have been extensively studied, but relatively
few works have studied bandits with non-concave reward. This work considers a large family …

Cited by 17 · Related articles · All 11 versions

[PDF] neurips.cc

Context-lumpable stochastic bandits

CW Lee, Q Liu, Y Abbasi Yadkori… - Advances in …, 2024 - proceedings.neurips.cc

We consider a contextual bandit problem with $S$ contexts and $K$ actions. In each
round $t = 1, 2, \dots$ the learner observes a random context and chooses an action based …

Cited by 2 · Related articles · All 7 versions

[PDF] mlr.press

Multi-task representation learning for pure exploration in linear bandits

Y Du, L Huang, W Sun - International Conference on …, 2023 - proceedings.mlr.press

Despite the recent success of representation learning in sequential decision making, the
study of the pure exploration scenario (i.e., identify the best option and minimize the sample …

Cited by 5 · Related articles · All 7 versions

[PDF] quantum-journal.org

Multi-armed quantum bandits: Exploration versus exploitation when learning properties of quantum states

J Lumbreras, E Haapasalo, M Tomamichel - Quantum, 2022 - quantum-journal.org

We initiate the study of tradeoffs between exploration and exploitation in online learning of
properties of quantum states. Given sequential oracle access to an unknown quantum state …

Cited by 18 · Related articles · All 9 versions

[PDF] arxiv.org

Gaussian imagination in bandit learning

Y Liu, AM Devraj, B Van Roy, K Xu - arXiv preprint arXiv:2201.01902, 2022 - arxiv.org

Assuming distributions are Gaussian often facilitates computations that are otherwise
intractable. We study the performance of an agent that attains a bounded information ratio …

Cited by 8 · Related articles · All 2 versions

[PDF] neurips.cc

Sample complexity for quadratic bandits: Hessian dependent bounds and optimal algorithms

Q Yu, Y Wang, B Huang, Q Lei… - Advances in Neural …, 2023 - proceedings.neurips.cc

In stochastic zeroth-order optimization, a problem of practical relevance is understanding
how to fully exploit the local geometry of the underlying objective function. We consider a …

Cited by 1 · Related articles · All 7 versions

[PDF] arxiv.org

Statistical complexity and optimal algorithms for nonlinear ridge bandits

N Rajaraman, Y Han, J Jiao… - The Annals of …, 2024 - projecteuclid.org

The Annals of Statistics 2024, Vol. 52, No. 6, 2557–2582.
https://doi.org/10.1214/24-AOS2395 © Institute of …

Cited by 1 · Related articles · All 2 versions

[PDF] mlr.press

Optimal Sample Complexity Bounds for Non-convex Optimization under Kurdyka-Lojasiewicz Condition

Q Yu, Y Wang, B Huang, Q Lei… - … Conference on Artificial …, 2023 - proceedings.mlr.press

Optimization of smooth reward functions under bandit feedback is a long-standing problem
in online learning. This paper approaches this problem by studying the convergence under …

Cited by 2 · Related articles · All 4 versions

[PDF] arxiv.org

Robust Gradient Descent for Phase Retrieval

A Buna, P Rebeschini - arXiv preprint arXiv:2410.10623, 2024 - arxiv.org

Recent progress in robust statistical learning has mainly tackled convex problems, like mean
estimation or linear regression, with non-convex challenges receiving less attention. Phase …