Bandit multi-linear DR-submodular maximization and its applications on adversarial submodular bandits
We investigate the online bandit learning of the monotone multi-linear DR-submodular
functions, designing the algorithm $\mathtt{BanditMLSM}$ that attains $O(T^{2/3}\log T)$ …
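For context, the standard definition of DR-submodularity that this line of work builds on (this is the usual textbook definition, stated here as background rather than quoted from the snippet): a differentiable function $F$ on a convex set $\mathcal{X}\subseteq\mathbb{R}^n_{+}$ is DR-submodular if its gradient is coordinatewise antitone.

```latex
% Diminishing-returns (DR) submodularity for differentiable F:
% larger points have coordinatewise smaller gradients.
\[
  x \le y \ \text{(coordinatewise)}
  \quad\Longrightarrow\quad
  \nabla F(x) \ge \nabla F(y) \ \text{(coordinatewise)}.
\]
% Monotonicity additionally requires \nabla F(x) \ge 0
% for all x in \mathcal{X}.
```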
An $\alpha$-regret analysis of Adversarial Bilateral Trade
We study sequential bilateral trade where sellers' and buyers' valuations are completely
arbitrary (i.e., determined by an adversary). Sellers and buyers are strategic agents with …
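For readers unfamiliar with the notion, the $\alpha$-regret used in this literature is conventionally defined as follows (a standard definition, assumed here rather than taken from the abstract), with $\alpha \in (0,1]$ the approximation ratio of the offline benchmark:

```latex
% alpha-regret over T rounds: the learner's cumulative reward is
% compared to an alpha fraction of the best fixed action in hindsight.
\[
  R_\alpha(T) \;=\; \alpha \,\max_{a \in \mathcal{A}} \sum_{t=1}^{T} f_t(a)
  \;-\; \sum_{t=1}^{T} f_t(a_t).
\]
```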
Fair assortment planning
Many online platforms, ranging from online retail stores to social media platforms, employ
algorithms to optimize their offered assortment of items (e.g., products and content). These …
A framework for adapting offline algorithms to solve combinatorial multi-armed bandit problems with bandit feedback
We investigate the problem of stochastic, combinatorial multi-armed bandits where the
learner only has access to bandit feedback and the reward function can be non-linear. We …
Learning product rankings robust to fake users
In many online platforms, customers' decisions are substantially influenced by product
rankings as most customers only examine a few top-ranked products. Concurrently, such …
A unified approach for maximizing continuous DR-submodular functions
M Pedramfar, C Quinn… - Advances in Neural …, 2024 - proceedings.neurips.cc
This paper presents a unified approach for maximizing continuous DR-submodular functions
that encompasses a range of settings and oracle access types. Our approach includes a …
Contextual bandits with cross-learning
In the classical contextual bandits problem, in each round $ t $, a learner observes some
context $c$, chooses some action $a$ to perform, and receives some reward $r_{a,t}(c)$ …
An explore-then-commit algorithm for submodular maximization under full-bandit feedback
We investigate the problem of combinatorial multi-armed bandits with stochastic submodular
(in expectation) rewards and full-bandit feedback, where no extra information other than the …
Randomized greedy learning for non-monotone stochastic submodular maximization under full-bandit feedback
We investigate the problem of unconstrained combinatorial multi-armed bandits with full-
bandit feedback and stochastic rewards for submodular maximization. Previous works …
Learning and collusion in multi-unit auctions
S Brânzei, M Derakhshan… - Advances in Neural …, 2023 - proceedings.neurips.cc
In a carbon auction, licenses for CO2 emissions are allocated among multiple interested
players. Inspired by this setting, we consider repeated multi-unit auctions with uniform …