No-regret learning in bilateral trade via global budget balance
Bilateral trade models the problem of intermediating between two rational agents—a seller
and a buyer—both characterized by a private valuation for an item they want to trade. We …
Autobidders with budget and ROI constraints: Efficiency, regret, and pacing dynamics
We study a game between autobidding algorithms that compete in an online advertising
platform. Each autobidder is tasked with maximizing its advertiser's total value over multiple …
Online Learning under Budget and ROI Constraints via Weak Adaptivity
We study online learning problems in which a decision maker has to make a sequence of
costly decisions, with the goal of maximizing their expected reward while adhering to budget …
Dynamic budget throttling in repeated second-price auctions
In today's online advertising markets, a crucial requirement for an advertiser is to control her
total expenditure within a time horizon under some budget. Among various budget control …
Bandits with replenishable knapsacks: the best of both worlds
The bandits with knapsack (BwK) framework models online decision-making problems in
which an agent makes a sequence of decisions subject to resource consumption …
Learning to defer in content moderation: The human-ai interplay
T Lykouris, W Weng - arXiv preprint arXiv:2402.12237, 2024 - arxiv.org
Successful content moderation in online platforms relies on a human-AI collaboration
approach. A typical heuristic estimates the expected harmfulness of a post and uses fixed …
Approximately stationary bandits with knapsacks
G Fikioris, É Tardos - The Thirty Sixth Annual Conference on …, 2023 - proceedings.mlr.press
Bandits with Knapsacks (BwK), the generalization of the Multi-Armed Bandits
problem under global budget constraints, has received a lot of attention in recent years. It …
The Relative Value of Prediction in Algorithmic Decision Making
JC Perdomo - arXiv preprint arXiv:2312.08511, 2023 - arxiv.org
Algorithmic predictions are increasingly used to inform the allocations of goods and
interventions in the public sphere. In these domains, predictions serve as a means to an …
Multi-armed bandits with guaranteed revenue per arm
We consider a Multi-Armed Bandit problem with covering constraints, where the
primary goal is to ensure that each arm receives a minimum expected reward while …
Stochastic Constrained Contextual Bandits via Lyapunov Optimization Based Estimation to Decision Framework
This paper studies the problem of stochastic constrained contextual bandits (CCB) under
general realizability condition where the expected rewards and costs are within general …