Optimal streaming algorithms for multi-armed bandits
This paper studies two variants of the best arm identification (BAI) problem under the
streaming model, where we have a stream of n arms with reward distributions supported on …
Collaborative top distribution identifications with limited interaction
We consider the following problem in this paper: given a set of n distributions, find the top-m
ones with the largest means. This problem is also called top-m arm identifications in the …
The role of interactivity in structured estimation
We study high-dimensional sparse estimation under three natural constraints:
communication constraints, local privacy constraints, and linear measurements …
Double explore-then-commit: Asymptotic optimality and beyond
We study the multi-armed bandit problem with subGaussian rewards. The explore-then-
commit (ETC) strategy, which consists of an exploration phase followed by an exploitation …
Optimal batched best arm identification
We study the batched best arm identification (BBAI) problem, where the learner's goal is to
identify the best arm while switching the policy as little as possible. In particular, we aim to …
Efficient and robust sequential decision making algorithms
P Xu - AI Magazine, 2024 - Wiley Online Library
Sequential decision‐making involves making informed decisions based on continuous
interactions with a complex environment. This process is ubiquitous in various applications …
Collaborative best arm identification with limited communication on non-IID data
In this paper, we study the tradeoffs between the time speedup and the round complexity in
the collaborative learning model with non-IID data, where multiple agents interact with …
An optimal elimination algorithm for learning a best arm
A Hassidim, R Kupfer, Y Singer - Advances in Neural …, 2020 - proceedings.neurips.cc
We consider the classic problem of $(\epsilon,\delta)$-\texttt{PAC} learning a best arm
where the goal is to identify with confidence $1-\delta$ an arm whose mean is an $\epsilon$ …
Approximate Top-$m$ Arm Identification with Heterogeneous Reward Variances
We study the effect of reward variance heterogeneity in the approximate top-$m$ arm
identification setting. In this setting, the reward for the $i$-th arm follows a $\sigma^2_i$ …
Batched coarse ranking in multi-armed bandits
We study the problem of coarse ranking in the multi-armed bandits (MAB) setting, where we
have a set of arms each of which is associated with an unknown distribution. The task is to …