Algorithmic chaining and the role of partial feedback in online nonparametric learning

Introduction to multi-armed bandits

A Slivkins - Foundations and Trends® in Machine Learning, 2019 - nowpublishers.com

Multi-armed bandits are a simple but very powerful framework for algorithms that make
decisions over time under uncertainty. An enormous body of work has accumulated over the …

Cited by 1073 · Related articles · All 7 versions

Nonstochastic multi-armed bandits with graph-structured feedback

N Alon, N Cesa-Bianchi, C Gentile, S Mannor… - SIAM Journal on …, 2017 - SIAM

We introduce and study a partial-information model of online learning, where a decision
maker repeatedly chooses from a finite set of actions and observes some subset of the …

Cited by 147 · Related articles · All 30 versions

Optimal no-regret learning for one-sided Lipschitz functions

P Dütting, G Guruganesh… - … on Machine Learning, 2023 - proceedings.mlr.press

Inspired by applications in pricing and contract design, we study the maximization of
one-sided Lipschitz functions, which only provide the (weaker) guarantee that they do not grow …

Cited by 13 · Related articles · All 3 versions

Fair contextual multi-armed bandits: Theory and experiments

Y Chen, A Cuellar, H Luo, J Modi… - … on Uncertainty in …, 2020 - proceedings.mlr.press

When an AI system interacts with multiple users, it frequently needs to make allocation
decisions. For instance, a virtual agent decides whom to pay attention to in a group, or a …

Cited by 67 · Related articles · All 11 versions

Contextual bandits with continuous actions: Smoothing, zooming, and adapting

A Krishnamurthy, J Langford, A Slivkins… - Journal of Machine …, 2020 - jmlr.org

We study contextual bandit learning with an abstract policy class and continuous action
space. We obtain two qualitatively different regret bounds: one competes with a smoothed …

Cited by 77 · Related articles · All 8 versions

Learning to bid optimally and efficiently in adversarial first-price auctions

Y Han, Z Zhou, A Flores, E Ordentlich… - arXiv preprint arXiv …, 2020 - arxiv.org

First-price auctions have very recently swept the online advertising industry, replacing
second-price auctions as the predominant auction mechanism on many platforms. This shift …

Cited by 39 · Related articles · All 4 versions

Optimal no-regret learning in repeated first-price auctions

Y Han, T Weissman, Z Zhou - Operations Research, 2024 - pubsonline.informs.org

We study online learning in repeated first-price auctions where a bidder, only observing the
winning bid at the end of each auction, learns to adaptively bid to maximize the cumulative …

Cited by 43 · Related articles · All 3 versions

Chaining meets chain rule: Multilevel entropic regularization and training of neural networks

AR Asadi, E Abbe - Journal of Machine Learning Research, 2020 - jmlr.org

We derive generalization and excess risk bounds for neural networks using a family of
complexity measures based on a multilevel relative entropy. The bounds are obtained by …

Cited by 34 · Related articles · All 5 versions

Contextual pricing for Lipschitz buyers

J Mao, R Leme, J Schneider - Advances in Neural …, 2018 - proceedings.neurips.cc

We investigate the problem of learning a Lipschitz function from binary feedback. In this
problem, a learner is trying to learn a Lipschitz function $f : [0,1]^d \rightarrow [0,1]$ over …

Cited by 52 · Related articles · All 6 versions

Efficient contextual bandits with continuous actions

M Majzoubi, C Zhang, R Chari… - Advances in …, 2020 - proceedings.neurips.cc

We create a computationally tractable learning algorithm for contextual bandits with
continuous actions having unknown structure. The new reduction-style algorithm composes …

Cited by 33 · Related articles · All 5 versions