Best arm identification with fixed budget: A large deviation perspective

PA Wang, RC Tzeng… - Advances in Neural …, 2024 - proceedings.neurips.cc
We consider the problem of identifying the best arm in stochastic Multi-Armed Bandits
(MABs) using a fixed sampling budget. Characterizing the minimal instance-specific error …

On the existence of a complexity in fixed budget bandit identification

R Degenne - The Thirty Sixth Annual Conference on …, 2023 - proceedings.mlr.press
In fixed budget bandit identification, an algorithm sequentially observes samples from
several distributions up to a given final time. It then answers a query about the set of …

Best arm identification for prompt learning under a limited budget

C Shi, K Yang, J Yang, C Shen - arXiv preprint arXiv:2402.09723, 2024 - openreview.net
The remarkable instruction-following capability of large language models (LLMs) has
sparked a growing interest in automatically learning suitable prompts. However, while many …

Experimental designs for heteroskedastic variance

J Weltz, T Fiez, A Volfovsky, E Laber… - Advances in …, 2024 - proceedings.neurips.cc
Most linear experimental design problems assume homogeneous variance, while
heteroskedastic noise is present in many realistic settings. Let a learner have …

Fixed-Budget Real-Valued Combinatorial Pure Exploration of Multi-Armed Bandit

S Nakamura, M Sugiyama - International Conference on …, 2024 - proceedings.mlr.press
We study the real-valued combinatorial pure exploration of the multi-armed bandit in the
fixed-budget setting. We first introduce an algorithm named the Combinatorial Successive …

A/B Testing and Best-arm Identification for Linear Bandits with Robustness to Non-stationarity

Z Xiong, R Camilleri, M Fazel, L Jain… - International …, 2024 - proceedings.mlr.press
We investigate the fixed-budget best-arm identification (BAI) problem for linear bandits in a
potentially non-stationary environment. Given a finite arm set $\mathcal {X}\subset\mathbb …

Optimal clustering with bandit feedback

J Yang, Z Zhong, VYF Tan - Journal of Machine Learning Research, 2024 - jmlr.org
This paper considers the problem of online clustering with bandit feedback. A set of arms (or
items) can be partitioned into various groups that are unknown. Within each group, the …

Efficient prompt optimization through the lens of best arm identification

C Shi, K Yang, Z Chen, J Li, J Yang… - The Thirty-eighth Annual …, 2024 - openreview.net
The remarkable instruction-following capability of large language models (LLMs) has
sparked a growing interest in automatically finding good prompts, i.e., prompt optimization …

Model-Based Best Arm Identification for Decreasing Bandits

S Takemori, Y Umeda… - … Conference on Artificial …, 2024 - proceedings.mlr.press
We study the problem of reliably identifying the best (lowest loss) arm in a stochastic
multi-armed bandit when the expected loss of each arm is monotone decreasing as a function of …

Prior-Dependent Allocations for Bayesian Fixed-Budget Best-Arm Identification in Structured Bandits

N Nguyen, I Aouali, A György, C Vernade - arXiv preprint arXiv …, 2024 - arxiv.org
We study the problem of Bayesian fixed-budget best-arm identification (BAI) in structured
bandits. We propose an algorithm that uses fixed allocations based on the prior information …