Adversarial bandits with knapsacks

A Slivkins - Foundations and Trends® in Machine Learning, 2019 - nowpublishers.com

Multi-armed bandits is a simple but very powerful framework for algorithms that make
decisions over time under uncertainty. An enormous body of work has accumulated over the …

Cited by 1232 - Related articles - All 7 versions

Achieving fairness in the stochastic multi-armed bandit problem

V Patil, G Ghalme, V Nair, Y Narahari - Journal of Machine Learning …, 2021 - jmlr.org

We study an interesting variant of the stochastic multi-armed bandit problem, which we call
the Fair-MAB problem, where, in addition to the objective of maximizing the sum of expected …

Cited by 140 - Related articles - All 17 versions

No-regret learning in time-varying zero-sum games

M Zhang, P Zhao, H Luo… - … Conference on Machine …, 2022 - proceedings.mlr.press

Learning from repeated play in a fixed two-player zero-sum game is a classic problem in
game theory and online learning. We consider a variant of this problem where the game …

Cited by 46 - Related articles - All 7 versions

A unifying framework for online optimization with long-term constraints

M Castiglioni, A Celli, A Marchesi… - Advances in Neural …, 2022 - proceedings.neurips.cc

We study online learning problems in which a decision maker has to take a sequence of
decisions subject to $m$ long-term constraints. The goal of the decision maker is to …

Cited by 32 - Related articles - All 8 versions

No-regret learning in bilateral trade via global budget balance

M Bernasconi, M Castiglioni, A Celli… - Proceedings of the 56th …, 2024 - dl.acm.org

Bilateral trade models the problem of intermediating between two rational agents—a seller
and a buyer—both characterized by a private valuation for an item they want to trade. We …

Cited by 10 - Related articles - All 3 versions

Versatile dueling bandits: Best-of-both world analyses for learning from relative preferences

A Saha, P Gaillard - International Conference on Machine …, 2022 - proceedings.mlr.press

We study the problem of $K$-armed dueling bandit for both stochastic and adversarial
environments, where the goal of the learner is to aggregate information through relative …

Cited by 23 - Related articles - All 2 versions

Learning equilibria in matching markets from bandit feedback

M Jagadeesan, A Wei, Y Wang… - Advances in …, 2021 - proceedings.neurips.cc

Large-scale, two-sided matching platforms must find market outcomes that align with user
preferences while simultaneously learning these preferences from data. But since …

Cited by 48 - Related articles - All 11 versions

Learning to bid in repeated first-price auctions with budgets

Q Wang, Z Yang, X Deng… - … Conference on Machine …, 2023 - proceedings.mlr.press

Budget management strategies in repeated auctions have received growing attention in
online advertising markets. However, previous work on budget management in online …

Cited by 13 - Related articles - All 6 versions

Autobidders with budget and ROI constraints: Efficiency, regret, and pacing dynamics

B Lucier, S Pattathil, A Slivkins… - The Thirty Seventh …, 2024 - proceedings.mlr.press

We study a game between autobidding algorithms that compete in an online advertising
platform. Each autobidder is tasked with maximizing its advertiser's total value over multiple …

Cited by 22 - Related articles - All 2 versions

Adversarial attacks on linear contextual bandits

E Garcelon, B Roziere, L Meunier… - Advances in …, 2020 - proceedings.neurips.cc

Contextual bandit algorithms are applied in a wide range of domains, from advertising to
recommender systems, from clinical trials to education. In many of these domains, malicious …

Cited by 63 - Related articles - All 9 versions