- Academic resource search

Introduction to multi-armed bandits

A Slivkins - Foundations and Trends® in Machine Learning, 2019 - nowpublishers.com

Multi-armed bandits are a simple but very powerful framework for algorithms that make
decisions over time under uncertainty. An enormous body of work has accumulated over the …

Cited by 1108 · Related articles · All 7 versions

[PDF] mlr.press

Dual mirror descent for online allocation problems

S Balseiro, H Lu, V Mirrokni - International Conference on …, 2020 - proceedings.mlr.press

We consider online allocation problems with concave revenue functions and resource
constraints, which are central problems in revenue management and online advertising. In …

Cited by 137 · Related articles · All 10 versions

[PDF] acm.org

Bandits with knapsacks

A Badanidiyuru, R Kleinberg, A Slivkins - Journal of the ACM (JACM), 2018 - dl.acm.org

Multi-armed bandit problems are the predominant theoretical model of exploration-exploitation
tradeoffs in learning, and they have countless applications ranging from medical …

Cited by 508 · Related articles · All 11 versions

[PDF] nowpublishers.com

Online matching and ad allocation

A Mehta - … and Trends® in Theoretical Computer Science, 2013 - nowpublishers.com

Matching is a classic problem with a rich history and a significant impact, both on the theory
of algorithms and in practice. Recently there has been a surge of interest in the online …

Cited by 476 · Related articles · All 16 versions

[PDF] aaai.org

Online task assignment in crowdsourcing markets

CJ Ho, J Vaughan - Proceedings of the AAAI conference on artificial …, 2012 - ojs.aaai.org

We explore the problem of assigning heterogeneous tasks to workers with different,
unknown skill sets in crowdsourcing markets such as Amazon Mechanical Turk. We first …

Cited by 445 · Related articles · All 10 versions

[PDF] arxiv.org

Real-time bidding for online advertising: measurement and analysis

S Yuan, J Wang, X Zhao - … of the seventh international workshop on data …, 2013 - dl.acm.org

Real-time bidding (RTB), also known as programmatic buying, has recently become the fastest
growing area in online advertising. Instead of bulk buying and inventory-centric buying …

Cited by 345 · Related articles · All 10 versions

[PDF] arxiv.org

A dynamic near-optimal algorithm for online linear programming

S Agrawal, Z Wang, Y Ye - Operations Research, 2014 - pubsonline.informs.org

A natural optimization model that formulates many online resource allocation problems is
the online linear programming (LP) problem in which the constraint matrix is revealed …

Cited by 353 · Related articles · All 20 versions

[PDF] usc.edu

Real-time optimization of personalized assortments

N Golrezaei, H Nazerzadeh… - Management …, 2014 - pubsonline.informs.org

Motivated by the availability of real-time data on customer characteristics, we consider the
problem of personalizing the assortment of products for each arriving customer. Using actual …

Cited by 265 · Related articles · All 14 versions

[PDF] arxiv.org

Bandits with concave rewards and convex knapsacks

S Agrawal, NR Devanur - Proceedings of the fifteenth ACM conference …, 2014 - dl.acm.org

In this paper, we consider a very general model for exploration-exploitation tradeoff which
allows arbitrary concave rewards and convex constraints on the decisions across time, in …

Cited by 231 · Related articles · All 6 versions

[PDF] arxiv.org

Adversarial bandits with knapsacks

N Immorlica, K Sankararaman, R Schapire… - Journal of the ACM, 2022 - dl.acm.org

We consider Bandits with Knapsacks (henceforth, BwK), a general model for multi-armed
bandits under supply/budget constraints. In particular, a bandit algorithm needs to solve a …

Cited by 124 · Related articles · All 12 versions