Cooperative stochastic bandits with asynchronous agents and constrained feedback

L Yang, YZJ Chen, S Pasteris… - Advances in …, 2021 - proceedings.neurips.cc
This paper studies a cooperative multi-armed bandit problem with $M$ agents cooperating
together to solve the same instance of a $K$-armed stochastic bandit problem with the goal …

Fair exploration via axiomatic bargaining

J Baek, V Farias - Advances in Neural Information …, 2021 - proceedings.neurips.cc
Motivated by the consideration of fairly sharing the cost of exploration between multiple
groups in learning problems, we develop the Nash bargaining solution in the context of multi …

On-demand communication for asynchronous multi-agent bandits

YZJ Chen, L Yang, X Wang, X Liu… - International …, 2023 - proceedings.mlr.press
This paper studies a cooperative multi-agent multi-armed stochastic bandit problem where
agents operate asynchronously: agent pull times and rates are unknown, irregular, and …

Exploration for free: how does reward heterogeneity improve regret in cooperative multi-agent bandits?

X Wang, L Yang, YZJ Chen, X Liu… - Uncertainty in …, 2023 - proceedings.mlr.press
This paper studies a cooperative multi-agent bandit scenario in which the rewards observed
by agents are heterogeneous—one agent's meat can be another agent's poison …

Cooperative multi-agent bandits: Distributed algorithms with optimal individual regret and constant communication costs

L Yang, X Wang, M Hajiesmaili, L Zhang, J Lui… - arXiv preprint arXiv …, 2023 - arxiv.org
Recently, there has been extensive study of cooperative multi-agent multi-armed bandits
where a set of distributed agents cooperatively play the same multi-armed bandit game. The …

Adversarial Attacks on Cooperative Multi-agent Bandits

J Zuo, Z Zhang, X Wang, C Chen, S Li, J Lui… - arXiv preprint arXiv …, 2023 - arxiv.org
Cooperative multi-agent multi-armed bandits (CMA2B) consider the collaborative efforts of
multiple agents in a shared multi-armed bandit game. We study latent vulnerabilities …

Human-in-the-loop Learning for Dynamic Congestion Games

H Li, L Duan - IEEE Transactions on Mobile Computing, 2024 - ieeexplore.ieee.org
Today mobile users learn and share their traffic observations via crowdsourcing platforms
(e.g., Google Maps and Waze). Yet such platforms simply cater to selfish users' myopic …

Pure Exploration in Asynchronous Federated Bandits

Z Wang, C Li, C Song, L Wang, Q Gu… - arXiv preprint arXiv …, 2023 - arxiv.org
We study the federated pure exploration problem of multi-armed bandits and linear bandits,
where $M$ agents cooperatively identify the best arm via communicating with the central …

Linear Bandits With Side Observations on Networks

A Kar, R Singh, F Liu, X Liu… - IEEE/ACM Transactions …, 2024 - ieeexplore.ieee.org
We investigate linear bandits in a network setting in the presence of side-observations
across nodes in order to design recommendation algorithms for users connected via social …

Combining Diverse Information for Coordinated Action: Stochastic Bandit Algorithms for Heterogeneous Agents

L Gordon, E Rolf, M Tambe - arXiv preprint arXiv:2408.03405, 2024 - arxiv.org
Stochastic multi-agent multi-armed bandits typically assume that the rewards from each arm
follow a fixed distribution, regardless of which agent pulls the arm. However, in many real …