[BOOK][B] Bandit algorithms
T Lattimore, C Szepesvári - 2020 - books.google.com
Decision-making in the face of uncertainty is a significant challenge in machine learning,
and the multi-armed bandit model is a commonly used framework to address it. This …
Hyperband: A novel bandit-based approach to hyperparameter optimization
Performance of machine learning algorithms depends critically on identifying a good set of
hyperparameters. While recent approaches use Bayesian optimization to adaptively select …
Time-uniform, nonparametric, nonasymptotic confidence sequences
The Annals of
Statistics 2021, Vol. 49, No. 2, 1055–1080 https://doi.org/10.1214/20-AOS1991 © Institute of …
Simple bayesian algorithms for best arm identification
D Russo - Conference on Learning Theory, 2016 - proceedings.mlr.press
This paper considers the optimal adaptive allocation of measurement effort for identifying the
best among a finite set of options or designs. An experimenter sequentially chooses designs …
Top two algorithms revisited
Top two algorithms arose as an adaptation of Thompson sampling to best arm identification
in multi-armed bandit models for parametric families of arms. They select the next arm to …
Combinatorial pure exploration of multi-armed bandits
We study the combinatorial pure exploration (CPE) problem in the stochastic
multi-armed bandit setting, where a learner explores a set of arms with the objective of identifying …
Improving the expected improvement algorithm
The expected improvement (EI) algorithm is a popular strategy for information collection in
optimization under uncertainty. The algorithm is widely known to be too greedy, but …
Inference for batched bandits
As bandit algorithms are increasingly utilized in scientific studies and industrial applications,
there is an associated increasing need for reliable inference methods based on the resulting …
Social learning in multi agent multi armed bandits
A Sankararaman, A Ganesh, S Shakkottai - Proceedings of the ACM on …, 2019 - dl.acm.org
Motivated by emerging need of learning algorithms for large scale networked and
decentralized systems, we introduce a distributed version of the classical stochastic Multi …
Tight (lower) bounds for the fixed budget best arm identification bandit problem
A Carpentier, A Locatelli - Conference on Learning Theory, 2016 - proceedings.mlr.press
We consider the problem of best arm identification with a fixed budget T, in the
K-armed stochastic bandit setting, with arms distribution defined on [0, 1]. We prove that any …