Parallelised Bayesian optimisation via Thompson sampling

K Kandasamy, A Krishnamurthy… - International …, 2018 - proceedings.mlr.press
We design and analyse variations of the classical Thompson sampling (TS) procedure for
Bayesian optimisation (BO) in settings where function evaluations are expensive but can be …
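
To make the named procedure concrete, here is a minimal sketch of one plain sequential Thompson-sampling loop for Gaussian-process BO; the RBF kernel, the candidate grid, and the toy objective are illustrative assumptions, and the parallel (synchronous/asynchronous) variants the paper actually designs and analyses are not reproduced here.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=0.2):
    """Squared-exponential kernel between the row vectors of A and B."""
    d2 = (A[:, None, :] - B[None, :, :]) ** 2
    return np.exp(-0.5 * d2.sum(-1) / lengthscale ** 2)

def thompson_sampling_bo(f, candidates, n_iters=20, noise=1e-3, rng=None):
    """Sequential GP Thompson sampling: at each step, draw one sample from the
    GP posterior over the candidate set and evaluate f at its argmax."""
    rng = np.random.default_rng(rng)
    X, y = [], []
    # Initialise with one random candidate so the posterior is well defined.
    x0 = candidates[rng.integers(len(candidates))]
    X.append(x0); y.append(f(x0))
    for _ in range(n_iters):
        Xa, ya = np.array(X), np.array(y)
        K = rbf_kernel(Xa, Xa) + noise * np.eye(len(Xa))
        Ks = rbf_kernel(candidates, Xa)
        Kss = rbf_kernel(candidates, candidates)
        Kinv = np.linalg.inv(K)
        mu = Ks @ Kinv @ ya
        cov = Kss - Ks @ Kinv @ Ks.T
        # One posterior sample plays the role of a randomised acquisition function.
        sample = rng.multivariate_normal(mu, cov + 1e-8 * np.eye(len(candidates)))
        x_next = candidates[int(np.argmax(sample))]
        X.append(x_next); y.append(f(x_next))
    best = int(np.argmax(y))
    return np.array(X)[best], y[best]

# Toy usage on a 1-D objective (maximisation over a grid of 200 candidates).
grid = np.linspace(0.0, 1.0, 200).reshape(-1, 1)
x_best, y_best = thompson_sampling_bo(lambda x: float(-np.sin(8 * x[0]) * x[0]), grid, rng=0)
```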

Batched multi-armed bandits problem

Z Gao, Y Han, Z Ren, Z Zhou - Advances in Neural …, 2019 - proceedings.neurips.cc
In this paper, we study the multi-armed bandit problem in the batched setting where the
employed policy must split data into a small number of batches. While the minimax regret for …
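
As an illustration of the batch constraint described in the snippet (all pulls in a batch are committed before any of their rewards are seen), here is a minimal batched successive-elimination sketch; the Bernoulli arms, batch sizes, and confidence radius are assumptions of the example, not the policy whose minimax regret the paper studies.

```python
import numpy as np

def batched_successive_elimination(arm_means, batch_sizes, rng=None):
    """Play a small number of batches; within a batch, pulls are fixed in
    advance and rewards are only revealed once the whole batch finishes."""
    rng = np.random.default_rng(rng)
    k = len(arm_means)
    pulls, sums = np.zeros(k), np.zeros(k)
    active = list(range(k))
    for batch in batch_sizes:
        per_arm = max(1, batch // len(active))
        # Commit to the whole batch of pulls before observing anything.
        for arm in active:
            rewards = rng.binomial(1, arm_means[arm], size=per_arm)
            sums[arm] += rewards.sum()
            pulls[arm] += per_arm
        # After the batch, eliminate arms that look clearly suboptimal.
        means = sums[active] / pulls[active]
        radius = np.sqrt(np.log(1 / 0.01) / (2 * pulls[active]))
        keep = means + radius >= (means - radius).max()
        active = [a for a, kept in zip(active, keep) if kept]
    return active  # surviving candidate best arms

# Example: 5 Bernoulli arms, 4 batches of growing size.
print(batched_successive_elimination([0.2, 0.4, 0.5, 0.55, 0.6],
                                     [50, 100, 400, 2000], rng=1))
```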

Inference for batched bandits

K Zhang, L Janson, S Murphy - Advances in neural …, 2020 - proceedings.neurips.cc
As bandit algorithms are increasingly utilized in scientific studies and industrial applications,
there is an associated increasing need for reliable inference methods based on the resulting …

Linear bandits with limited adaptivity and learning distributional optimal design

Y Ruan, J Yang, Y Zhou - Proceedings of the 53rd Annual ACM SIGACT …, 2021 - dl.acm.org
Motivated by practical needs such as large-scale learning, we study the impact of adaptivity
constraints on linear contextual bandits, a central problem in online learning and decision …
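
To make the adaptivity constraint concrete, the sketch below runs a linear contextual bandit in which the ridge-regression estimate of the unknown parameter may be refreshed only at a few pre-set batch boundaries; the greedy arm choice, Gaussian contexts, and noise level are illustrative assumptions and are unrelated to the distributional optimal design developed in the paper.

```python
import numpy as np

def batched_linear_bandit(theta_star, horizon, update_rounds, d=5, n_arms=20,
                          lam=1.0, noise=0.1, rng=None):
    """Linear bandit where the learner may update its policy only at the
    rounds listed in `update_rounds` (limited adaptivity)."""
    rng = np.random.default_rng(rng)
    V = lam * np.eye(d)          # regularised Gram matrix
    b = np.zeros(d)              # sum of reward-weighted contexts
    theta_hat = np.zeros(d)      # estimate, frozen between batch boundaries
    regret = 0.0
    for t in range(horizon):
        if t in update_rounds:                      # batch boundary: re-estimate
            theta_hat = np.linalg.solve(V, b)
        arms = rng.standard_normal((n_arms, d))     # fresh contexts each round
        arms /= np.linalg.norm(arms, axis=1, keepdims=True)
        choice = arms[int(np.argmax(arms @ theta_hat))]
        reward = choice @ theta_star + noise * rng.standard_normal()
        V += np.outer(choice, choice)
        b += reward * choice
        regret += (arms @ theta_star).max() - choice @ theta_star
    return regret

theta = np.ones(5) / np.sqrt(5)
print(batched_linear_bandit(theta, horizon=2000, update_rounds={0, 50, 500}, rng=0))
```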

Learning with limited rounds of adaptivity: Coin tossing, multi-armed bandits, and ranking from pairwise comparisons

A Agarwal, S Agarwal, S Assadi… - … on Learning Theory, 2017 - proceedings.mlr.press
In many learning settings, active/adaptive querying is possible, but the number of rounds of
adaptivity is limited. We study the relationship between query complexity and adaptivity in …

Stochastic bandit models for delayed conversions

C Vernade, O Cappé, V Perchet - arXiv preprint arXiv:1706.09186, 2017 - arxiv.org
Online advertising and product recommendation are important application domains for
multi-armed bandit methods. In these fields, the reward that is immediately available is most …
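
The delayed-conversion setting can be pictured with a small simulator in which a pull may trigger a conversion that only becomes observable several rounds later, or never within the horizon; the uniform placeholder policy, Bernoulli conversions, and geometric delays below are assumptions of the illustration, not the censoring model or estimators of the paper.

```python
import numpy as np
from collections import defaultdict

def delayed_conversion_stream(conversion_probs, horizon, mean_delay=20, rng=None):
    """Simulate an environment where a pull at round t may yield a conversion
    (reward 1) that only becomes observable at round t + delay."""
    rng = np.random.default_rng(rng)
    pending = defaultdict(list)        # arrival round -> list of (arm, reward)
    observed = []                      # what the learner actually sees, per round
    for t in range(horizon):
        arm = rng.integers(len(conversion_probs))       # placeholder uniform policy
        if rng.random() < conversion_probs[arm]:
            delay = rng.geometric(1.0 / mean_delay)     # conversion happens later
            pending[t + delay].append((arm, 1))
        observed.append(pending.pop(t, []))             # only matured feedback
    return observed

feedback = delayed_conversion_stream([0.05, 0.1], horizon=500, rng=3)
print(sum(r for round_fb in feedback for _, r in round_fb), "conversions observed in time")
```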

Regret bounds for batched bandits

H Esfandiari, A Karbasi, A Mehrabian… - Proceedings of the AAAI …, 2021 - ojs.aaai.org
We present simple algorithms for batched stochastic multi-armed bandit and batched
stochastic linear bandit problems. We prove bounds for their expected regrets that improve …
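
For reference, the expected regret bounded in this batched line of work is the usual cumulative notion; the batch constraint only restricts when past observations may influence the policy. The notation below is generic rather than taken from the paper:

```latex
R_T \;=\; T\,\mu^\star \;-\; \mathbb{E}\!\left[\sum_{t=1}^{T} \mu_{A_t}\right],
\qquad \mu^\star \;=\; \max_{1 \le i \le K} \mu_i ,
```

where $A_t$ is the arm pulled at round $t$ and the $T$ pulls must be grouped into a small number of batches whose contents are fixed before any reward in the batch is observed.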

Revisiting simple regret: Fast rates for returning a good arm

Y Zhao, C Stephens, C Szepesvári… - … on Machine Learning, 2023 - proceedings.mlr.press
Simple regret is a natural and parameter-free performance criterion for pure exploration in
multi-armed bandits yet is less popular than the probability of missing the best arm or an …
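
Concretely, if $\hat{J}_n$ denotes the arm recommended after $n$ exploration rounds, the simple regret referred to above is the standard quantity (generic notation, not copied from the paper):

```latex
r_n \;=\; \mu^\star - \mathbb{E}\big[\mu_{\hat{J}_n}\big],
\qquad \mu^\star \;=\; \max_{i}\, \mu_i ,
```

which the snippet contrasts with the probability of missing the best arm, $\Pr(\hat{J}_n \neq i^\star)$.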

Best arm identification in multi-armed bandits with delayed feedback

A Grover, T Markov, P Attia, N Jin… - International …, 2018 - proceedings.mlr.press
In this paper, we propose a generalization of the best arm identification problem in
stochastic multi-armed bandits (MAB) to the setting where every pull of an arm is associated …

Adaptive algorithms for relaxed pareto set identification

C Kone, E Kaufmann, L Richert - Advances in Neural …, 2023 - proceedings.neurips.cc
In this paper we revisit the fixed-confidence identification of the Pareto optimal set in a multi-
objective multi-armed bandit model. As the sample complexity to identify the exact Pareto set …
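
Since the snippet hinges on what the Pareto optimal set of arms is, here is a minimal helper that computes it from known mean vectors (the paper, by contrast, must identify it from samples at a fixed confidence level); the convention that larger is better in every objective is an assumption of the example.

```python
import numpy as np

def pareto_set(means):
    """Return indices of arms whose mean vectors are not dominated, i.e. no other
    arm is at least as good in every objective and strictly better in at least one."""
    means = np.asarray(means, dtype=float)
    dominated = [
        any(np.all(other >= mu) and np.any(other > mu)
            for j, other in enumerate(means) if j != i)
        for i, mu in enumerate(means)
    ]
    return [i for i, d in enumerate(dominated) if not d]

# Three bi-objective arms: arm 2 is dominated by arm 0.
print(pareto_set([[0.8, 0.3], [0.4, 0.9], [0.6, 0.2]]))  # -> [0, 1]
```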