Distributed exploration in multi-armed bandits

Pervasive AI for IoT applications: A survey on resource-efficient distributed artificial intelligence

E Baccour, N Mhaisen, AA Abdellatif… - … Surveys & Tutorials, 2022 - ieeexplore.ieee.org

Artificial intelligence (AI) has witnessed a substantial breakthrough in a variety of Internet of
Things (IoT) applications and services, spanning from recommendation systems and speech …

Cited by 129 · Related articles · All 10 versions

Distributed multi-player bandits - a game of thrones approach

I Bistritz, A Leshem - Advances in Neural Information …, 2018 - proceedings.neurips.cc

We consider a multi-armed bandit game where N players compete for K arms for T turns.
Each player has different expected rewards for the arms, and the instantaneous rewards are …

Cited by 163 · Related articles · All 6 versions

Social learning in multi agent multi armed bandits

A Sankararaman, A Ganesh, S Shakkottai - Proceedings of the ACM on …, 2019 - dl.acm.org

Motivated by emerging need of learning algorithms for large scale networked and
decentralized systems, we introduce a distributed version of the classical stochastic Multi …

Cited by 100 · Related articles · All 10 versions

Decentralized cooperative stochastic bandits

D Martínez-Rubio, V Kanade… - Advances in Neural …, 2019 - proceedings.neurips.cc

We study a decentralized cooperative stochastic multi-armed bandit problem with K arms on
a network of N agents. In our model, the reward distribution of each arm is the same for each …

Cited by 122 · Related articles · All 8 versions

Fast distributed bandits for online recommendation systems

K Mahadik, Q Wu, S Li, A Sabne - Proceedings of the 34th ACM …, 2020 - dl.acm.org

Contextual bandit algorithms are commonly used in recommender systems, where content
popularity can change rapidly. These algorithms continuously learn latent mappings …

Cited by 71 · Related articles · All 4 versions

Distributed bandit learning: Near-optimal regret with efficient communication

Y Wang, J Hu, X Chen, L Wang - arXiv preprint arXiv:1904.06309, 2019 - arxiv.org

We study the problem of regret minimization for distributed bandits learning, in which $M$
agents work collaboratively to minimize their total regret under the coordination of a central …

Cited by 98 · Related articles · All 3 versions

Learning with limited rounds of adaptivity: Coin tossing, multi-armed bandits, and ranking from pairwise comparisons

A Agarwal, S Agarwal, S Assadi… - … on Learning Theory, 2017 - proceedings.mlr.press

In many learning settings, active/adaptive querying is possible, but the number of rounds of
adaptivity is limited. We study the relationship between query complexity and adaptivity in …

Cited by 117 · Related articles · All 6 versions

Linear bandits with limited adaptivity and learning distributional optimal design

Y Ruan, J Yang, Y Zhou - Proceedings of the 53rd Annual ACM SIGACT …, 2021 - dl.acm.org

Motivated by practical needs such as large-scale learning, we study the impact of adaptivity
constraints to linear contextual bandits, a central problem in online learning and decision …

Cited by 60 · Related articles · All 6 versions

Near-optimal collaborative learning in bandits

C Réda, S Vakili, E Kaufmann - Advances in Neural …, 2022 - proceedings.neurips.cc

This paper introduces a general multi-agent bandit model in which each agent is facing a
finite set of arms and may communicate with other agents through a central controller in …

Cited by 21 · Related articles · All 11 versions

Beyond regret for decentralized bandits in matching markets

S Basu, KA Sankararaman… - … on Machine Learning, 2021 - proceedings.mlr.press

We design decentralized algorithms for regret minimization in the two sided matching market
with one-sided bandit feedback that significantly improves upon the prior works (Liu et al …

Cited by 47 · Related articles · All 5 versions