Eluder dimension and the sample complexity of optimistic exploration

D Russo, B Van Roy - Advances in Neural Information …, 2013 - proceedings.neurips.cc
This paper considers the sample complexity of the multi-armed bandit with dependencies
among the arms. Some of the most successful algorithms for this problem use the principle …

Learning to optimize via posterior sampling

D Russo, B Van Roy - Mathematics of Operations Research, 2014 - pubsonline.informs.org
This paper considers the use of a simple posterior sampling algorithm to balance between
exploration and exploitation when learning to optimize actions such as in multiarmed bandit …

X-Armed Bandits

S Bubeck, R Munos, G Stoltz, C Szepesvári - Journal of Machine Learning …, 2011 - jmlr.org
We consider a generalization of stochastic bandits where the set of arms, X, is allowed to be
a generic measurable space and the mean-payoff function is “locally Lipschitz” with respect …

Minimal exploration in structured stochastic bandits

R Combes, S Magureanu… - Advances in Neural …, 2017 - proceedings.neurips.cc
This paper introduces and addresses a wide class of stochastic bandit problems where the
function mapping the arm to the corresponding reward exhibits some known structural …

Unimodal bandits: Regret lower bounds and optimal algorithms

R Combes, A Proutiere - International Conference on …, 2014 - proceedings.mlr.press
We consider stochastic multi-armed bandits where the expected reward is a unimodal
function over partially ordered arms. This important class of problems has been recently …

Information complexity in bandit subset selection

E Kaufmann… - Conference on Learning …, 2013 - proceedings.mlr.press
We consider the problem of efficiently exploring the arms of a stochastic bandit to identify the
best subset. Under the PAC and the fixed-budget formulations, we derive improved bounds …

Thompson sampling and approximate inference

M Phan, Y Abbasi Yadkori… - Advances in Neural …, 2019 - proceedings.neurips.cc
We study the effects of approximate inference on the performance of Thompson sampling in
the $k$-armed bandit problems. Thompson sampling is a successful algorithm for online …

An improved parametrization and analysis of the EXP3++ algorithm for stochastic and adversarial bandits

Y Seldin, G Lugosi - Conference on Learning Theory, 2017 - proceedings.mlr.press
We present a new strategy for gap estimation in randomized algorithms for multiarmed
bandits and combine it with the EXP3++ algorithm of Seldin and Slivkins (2014). In the …
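As an illustrative aside to the entry above: EXP3++ builds on the adversarial-bandit algorithm EXP3 of Auer et al. A minimal sketch of plain EXP3 (not the EXP3++ variant with gap estimation; the function names, parameters, and renormalization step are assumptions for this sketch, not taken from the paper):

```python
import math
import random

def exp3(reward_fn, k, horizon, gamma=0.1, seed=0):
    """Minimal EXP3 sketch: exponential weights with importance-weighted
    reward estimates and a uniform exploration floor of gamma/k per arm.
    Illustrative only; rewards are assumed to lie in [0, 1]."""
    rng = random.Random(seed)
    weights = [1.0] * k
    pulls = [0] * k
    for _ in range(horizon):
        total = sum(weights)
        # Mix the exponential-weights distribution with uniform exploration.
        probs = [(1 - gamma) * w / total + gamma / k for w in weights]
        arm = rng.choices(range(k), weights=probs)[0]
        reward = reward_fn(arm)
        # Importance-weighted estimate keeps the reward estimator unbiased.
        est = reward / probs[arm]
        weights[arm] *= math.exp(gamma * est / k)
        # Renormalize to avoid floating-point overflow over long horizons.
        norm = sum(weights)
        weights = [w / norm for w in weights]
        pulls[arm] += 1
    return pulls
```

With a deterministic reward function favoring one arm, the pull counts concentrate on that arm while the gamma/k floor keeps every arm played occasionally.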

Analysis of Thompson sampling for the multi-armed bandit problem

S Agrawal, N Goyal - Conference on Learning Theory, 2012 - proceedings.mlr.press
The multi-armed bandit problem is a popular model for studying exploration/exploitation
trade-off in sequential decision problems. Many algorithms are now available for this well …
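As an illustrative aside to the entry above: for Bernoulli rewards, Thompson sampling maintains a Beta posterior per arm, samples a mean estimate from each posterior, and plays the argmax. A minimal sketch under those assumptions (the variable names and two-armed setup are illustrative, not from the paper):

```python
import random

def thompson_sampling(arm_probs, horizon, seed=0):
    """Minimal Beta-Bernoulli Thompson sampling sketch (illustrative only).

    arm_probs: true Bernoulli success probability of each arm (simulation only).
    Returns the number of times each arm was pulled over the horizon.
    """
    rng = random.Random(seed)
    k = len(arm_probs)
    successes = [1] * k  # Beta(1, 1) uniform prior on each arm's mean
    failures = [1] * k
    pulls = [0] * k
    for _ in range(horizon):
        # Sample a mean estimate from each arm's Beta posterior, play the argmax.
        samples = [rng.betavariate(successes[i], failures[i]) for i in range(k)]
        arm = max(range(k), key=lambda i: samples[i])
        reward = 1 if rng.random() < arm_probs[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1
    return pulls
```

As the posteriors sharpen, samples from the better arm's posterior win the argmax more often, so exploration of inferior arms tapers off on its own.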

Better Algorithms for Benign Bandits

E Hazan, S Kale - Journal of Machine Learning Research, 2011 - jmlr.org
The online multi-armed bandit problem and its generalizations are repeated decision
making problems, where the goal is to select one of several possible decisions in every …