Parallelised Bayesian optimisation via Thompson sampling
K Kandasamy, A Krishnamurthy… - International …, 2018 - proceedings.mlr.press
We design and analyse variations of the classical Thompson sampling (TS) procedure for
Bayesian optimisation (BO) in settings where function evaluations are expensive but can be …
Batched multi-armed bandits problem
In this paper, we study the multi-armed bandit problem in the batched setting where the
employed policy must split data into a small number of batches. While the minimax regret for …
Inference for batched bandits
As bandit algorithms are increasingly utilized in scientific studies and industrial applications,
there is an associated increasing need for reliable inference methods based on the resulting …
Linear bandits with limited adaptivity and learning distributional optimal design
Motivated by practical needs such as large-scale learning, we study the impact of adaptivity
constraints on linear contextual bandits, a central problem in online learning and decision …
Learning with limited rounds of adaptivity: Coin tossing, multi-armed bandits, and ranking from pairwise comparisons
In many learning settings, active/adaptive querying is possible, but the number of rounds of
adaptivity is limited. We study the relationship between query complexity and adaptivity in …
Stochastic bandit models for delayed conversions
Online advertising and product recommendation are important domains of applications for
multi-armed bandit methods. In these fields, the reward that is immediately available is most …
Regret bounds for batched bandits
We present simple algorithms for batched stochastic multi-armed bandit and batched
stochastic linear bandit problems. We prove bounds for their expected regrets that improve …
Revisiting simple regret: Fast rates for returning a good arm
Simple regret is a natural and parameter-free performance criterion for pure exploration in
multi-armed bandits yet is less popular than the probability of missing the best arm or an …
Best arm identification in multi-armed bandits with delayed feedback
In this paper, we propose a generalization of the best arm identification problem in
stochastic multi-armed bandits (MAB) to the setting where every pull of an arm is associated …
Adaptive algorithms for relaxed pareto set identification
C Kone, E Kaufmann, L Richert - Advances in Neural …, 2023 - proceedings.neurips.cc
In this paper we revisit the fixed-confidence identification of the Pareto optimal set in a multi-
objective multi-armed bandit model. As the sample complexity to identify the exact Pareto set …