- Academic resource search

Optimal streaming algorithms for multi-armed bandits

T Jin, K Huang, J Tang, X Xiao - International Conference on …, 2021 - proceedings.mlr.press

This paper studies two variants of the best arm identification (BAI) problem under the
streaming model, where we have a stream of n arms with reward distributions supported on …

Cited by 21 - Related articles - All 4 versions


Collaborative top distribution identifications with limited interaction

N Karpov, Q Zhang, Y Zhou - 2020 IEEE 61st Annual …, 2020 - ieeexplore.ieee.org

We consider the following problem in this paper: given a set of n distributions, find the top-m
ones with the largest means. This problem is also called top-m arm identifications in the …

Cited by 25 - Related articles - All 8 versions


The role of interactivity in structured estimation

J Acharya, CL Canonne, Z Sun… - … on Learning Theory, 2022 - proceedings.mlr.press

We study high-dimensional sparse estimation under three natural constraints:
communication constraints, local privacy constraints, and linear measurements …

Cited by 15 - Related articles - All 6 versions


Double explore-then-commit: Asymptotic optimality and beyond

T Jin, P Xu, X Xiao, Q Gu - Conference on Learning Theory, 2021 - proceedings.mlr.press

We study the multi-armed bandit problem with subGaussian rewards. The explore-then-
commit (ETC) strategy, which consists of an exploration phase followed by an exploitation …

Cited by 25 - Related articles - All 7 versions


Optimal batched best arm identification

T Jin, Y Yang, J Tang, X Xiao, P Xu - arXiv preprint arXiv:2310.14129, 2023 - arxiv.org

We study the batched best arm identification (BBAI) problem, where the learner's goal is to
identify the best arm while switching the policy as little as possible. In particular, we aim to …

Cited by 3 - Related articles - All 2 versions


Efficient and robust sequential decision making algorithms

P Xu - AI Magazine, 2024 - Wiley Online Library

Sequential decision‐making involves making informed decisions based on continuous
interactions with a complex environment. This process is ubiquitous in various applications …

Collaborative best arm identification with limited communication on non-IID data

N Karpov, Q Zhang - arXiv preprint arXiv:2207.08015, 2022 - arxiv.org

In this paper, we study the tradeoffs between the time speedup and the round complexity in
the collaborative learning model with non-IID data, where multiple agents interact with …

Cited by 7 - Related articles - All 2 versions


An optimal elimination algorithm for learning a best arm

A Hassidim, R Kupfer, Y Singer - Advances in Neural …, 2020 - proceedings.neurips.cc

We consider the classic problem of $(\epsilon,\delta)$-PAC learning a best arm
where the goal is to identify with confidence $1-\delta$ an arm whose mean is an $\epsilon …

Cited by 14 - Related articles - All 5 versions


Approximate Top-$m$ Arm Identification with Heterogeneous Reward Variances

R Zhou, C Tian - International Conference on Artificial …, 2022 - proceedings.mlr.press

We study the effect of reward variance heterogeneity in the approximate top-$m$ arm
identification setting. In this setting, the reward for the $i$-th arm follows a $\sigma^2_i …

Cited by 5 - Related articles - All 4 versions


[PDF] Batched coarse ranking in multi-armed bandits

N Karpov, Q Zhang - Conference on Neural Information Processing …, 2020 - par.nsf.gov

We study the problem of coarse ranking in the multi-armed bandits (MAB) setting, where we
have a set of arms each of which is associated with an unknown distribution. The task is to …

Cited by 9 - Related articles - All 5 versions