Best arm identification with fixed budget: A large deviation perspective
We consider the problem of identifying the best arm in stochastic Multi-Armed Bandits
(MABs) using a fixed sampling budget. Characterizing the minimal instance-specific error …
Adaptivity and confounding in multi-armed bandit experiments
We explore a new model of bandit experiments where a potentially nonstationary sequence
of contexts influences arms' performance. Context-unaware algorithms risk confounding …
On the existence of a complexity in fixed budget bandit identification
R Degenne - The Thirty Sixth Annual Conference on …, 2023 - proceedings.mlr.press
In fixed budget bandit identification, an algorithm sequentially observes samples from
several distributions up to a given final time. It then answers a query about the set of …
Non-asymptotic analysis of a UCB-based Top Two algorithm
A Top Two sampling rule for bandit identification is a method which selects the next arm to
sample from among two candidate arms, a leader and a challenger. Due to their simplicity …
On uniformly optimal algorithms for best arm identification in two-armed bandits with fixed budget
We study the problem of best-arm identification with fixed budget in stochastic two-arm
bandits with Bernoulli rewards. We prove that surprisingly, there is no algorithm that (i) …
Bandit algorithms for policy learning: methods, implementation, and welfare-performance
T Kitagawa, J Rowley - The Japanese Economic Review, 2024 - Springer
Static supervised learning—in which experimental data serves as a training sample for the
estimation of an optimal treatment assignment policy—is a commonly assumed framework of …
Locally Optimal Fixed-Budget Best Arm Identification in Two-Armed Gaussian Bandits with Unknown Variances
M Kato - arXiv preprint arXiv:2312.12741, 2023 - arxiv.org
We address the problem of best arm identification (BAI) with a fixed budget for two-armed
Gaussian bandits. In BAI, given multiple arms, we aim to find the best arm, an arm with the …
Optimizing Adaptive Experiments: A Unified Approach to Regret Minimization and Best-Arm Identification
Practitioners conducting adaptive experiments often encounter two competing priorities:
reducing the cost of experimentation by effectively assigning treatments during the …
Dynamic Targeting: Experimental Evidence from Energy Rebate Programs
Economic policies often involve dynamic interventions, where individuals receive repeated
interventions over multiple periods. These dynamics make past responses informative to …
Best arm identification with contextual information under a small gap
We study the best-arm identification (BAI) problem with a fixed budget and contextual
(covariate) information. In each round of an adaptive experiment, after observing contextual …