Best-arm identification algorithms for multi-armed bandits in the fixed confidence setting

T Lattimore, C Szepesvári - 2020 - books.google.com

Decision-making in the face of uncertainty is a significant challenge in machine learning,
and the multi-armed bandit model is a commonly used framework to address it. This …

Cited by 2922 Related articles All 9 versions

Hyperband: A novel bandit-based approach to hyperparameter optimization

L Li, K Jamieson, G DeSalvo, A Rostamizadeh… - Journal of Machine …, 2018 - jmlr.org

Performance of machine learning algorithms depends critically on identifying a good set of
hyperparameters. While recent approaches use Bayesian optimization to adaptively select …

Cited by 2820 Related articles All 13 versions

Time-uniform, nonparametric, nonasymptotic confidence sequences

SR Howard, A Ramdas, J McAuliffe, J Sekhon - 2021 - projecteuclid.org

The Annals of Statistics 2021, Vol. 49, No. 2, 1055–1080.
https://doi.org/10.1214/20-AOS1991 © Institute of …

Cited by 277 Related articles All 7 versions

Simple Bayesian algorithms for best arm identification

D Russo - Conference on Learning Theory, 2016 - proceedings.mlr.press

This paper considers the optimal adaptive allocation of measurement effort for identifying the
best among a finite set of options or designs. An experimenter sequentially chooses designs …

Cited by 309 Related articles All 10 versions

Top two algorithms revisited

M Jourdan, R Degenne, D Baudry… - Advances in …, 2022 - proceedings.neurips.cc

Top two algorithms arose as an adaptation of Thompson sampling to best arm identification
in multi-armed bandit models for parametric families of arms. They select the next arm to …

Cited by 36 Related articles All 12 versions

Combinatorial pure exploration of multi-armed bandits

S Chen, T Lin, I King, MR Lyu… - Advances in neural …, 2014 - proceedings.neurips.cc

We study the combinatorial pure exploration (CPE) problem in the stochastic multi-armed
bandit setting, where a learner explores a set of arms with the objective of identifying …

Cited by 239 Related articles All 11 versions

Improving the expected improvement algorithm

C Qin, D Klabjan, D Russo - Advances in Neural …, 2017 - proceedings.neurips.cc

The expected improvement (EI) algorithm is a popular strategy for information collection in
optimization under uncertainty. The algorithm is widely known to be too greedy, but …

Cited by 153 Related articles All 7 versions

Inference for batched bandits

K Zhang, L Janson, S Murphy - Advances in neural …, 2020 - proceedings.neurips.cc

As bandit algorithms are increasingly utilized in scientific studies and industrial applications,
there is an associated increasing need for reliable inference methods based on the resulting …

Cited by 93 Related articles All 11 versions

Social learning in multi agent multi armed bandits

A Sankararaman, A Ganesh, S Shakkottai - Proceedings of the ACM on …, 2019 - dl.acm.org

Motivated by emerging need of learning algorithms for large scale networked and
decentralized systems, we introduce a distributed version of the classical stochastic Multi …

Cited by 93 Related articles All 10 versions

Tight (lower) bounds for the fixed budget best arm identification bandit problem

A Carpentier, A Locatelli - Conference on Learning Theory, 2016 - proceedings.mlr.press

We consider the problem of best arm identification with a fixed budget T, in the K-armed
stochastic bandit setting, with arms distribution defined on [0, 1]. We prove that any …

Cited by 147 Related articles All 7 versions