Best arm identification with fixed budget: A large deviation perspective

PA Wang, RC Tzeng… - Advances in Neural …, 2024 - proceedings.neurips.cc
We consider the problem of identifying the best arm in stochastic Multi-Armed Bandits
(MABs) using a fixed sampling budget. Characterizing the minimal instance-specific error …

Adaptivity and confounding in multi-armed bandit experiments

C Qin, D Russo - arXiv preprint arXiv:2202.09036, 2022 - aeaweb.org
We explore a new model of bandit experiments where a potentially nonstationary sequence
of contexts influences arms' performance. Context-unaware algorithms risk confounding …

On the existence of a complexity in fixed budget bandit identification

R Degenne - The Thirty Sixth Annual Conference on …, 2023 - proceedings.mlr.press
In fixed budget bandit identification, an algorithm sequentially observes samples from
several distributions up to a given final time. It then answers a query about the set of …

Non-asymptotic analysis of a UCB-based Top Two algorithm

M Jourdan, R Degenne - Advances in Neural Information …, 2024 - proceedings.neurips.cc
A Top Two sampling rule for bandit identification is a method which selects the next arm to
sample from among two candidate arms, a leader and a challenger. Due to their simplicity …

On uniformly optimal algorithms for best arm identification in two-armed bandits with fixed budget

PA Wang, K Ariu, A Proutiere - arXiv preprint arXiv:2308.12000, 2023 - arxiv.org
We study the problem of best-arm identification with fixed budget in stochastic two-arm
bandits with Bernoulli rewards. We prove that surprisingly, there is no algorithm that (i) …

Bandit algorithms for policy learning: methods, implementation, and welfare-performance

T Kitagawa, J Rowley - The Japanese Economic Review, 2024 - Springer
Static supervised learning—in which experimental data serves as a training sample for the
estimation of an optimal treatment assignment policy—is a commonly assumed framework of …

Locally Optimal Fixed-Budget Best Arm Identification in Two-Armed Gaussian Bandits with Unknown Variances

M Kato - arXiv preprint arXiv:2312.12741, 2023 - arxiv.org
We address the problem of best arm identification (BAI) with a fixed budget for two-armed
Gaussian bandits. In BAI, given multiple arms, we aim to find the best arm, an arm with the …

Optimizing Adaptive Experiments: A Unified Approach to Regret Minimization and Best-Arm Identification

C Qin, D Russo - arXiv preprint arXiv:2402.10592, 2024 - arxiv.org
Practitioners conducting adaptive experiments often encounter two competing priorities:
reducing the cost of experimentation by effectively assigning treatments during the …

Dynamic Targeting: Experimental Evidence from Energy Rebate Programs

T Ida, T Ishihara, K Ito, D Kido, T Kitagawa… - 2024 - nber.org
Economic policies often involve dynamic interventions, where individuals receive repeated
interventions over multiple periods. These dynamics make past responses informative to …

Best arm identification with contextual information under a small gap

M Kato, M Imaizumi, T Ishihara, T Kitagawa - arXiv preprint arXiv …, 2022 - arxiv.org
We study the best-arm identification (BAI) problem with a fixed budget and contextual
(covariate) information. In each round of an adaptive experiment, after observing contextual …