Parallelizing contextual bandits

C Fannjiang, J Listgarten - Cold Spring Harbor …, 2024 - cshperspectives.cshlp.org

Machine learning–based design has gained traction in the sciences, most notably in the
design of small molecules, materials, and proteins, with societal applications ranging from …

Cited by 12 · Related articles · All 5 versions

Federated linear contextual bandits

R Huang, W Wu, J Yang… - Advances in neural …, 2021 - proceedings.neurips.cc

This paper presents a novel federated linear contextual bandits model, where individual
clients face different $K$-armed stochastic bandits coupled through common global …

Cited by 81 · Related articles · All 9 versions

Experiment planning with function approximation

A Pacchiano, J Lee, E Brunskill - Advances in Neural …, 2024 - proceedings.neurips.cc

We study the problem of experiment planning with function approximation in contextual
bandit problems. In settings where there is a significant overhead to deploying adaptive …

Cited by 4 · Related articles · All 6 versions

Online learning of energy consumption for navigation of electric vehicles

N Åkerblom, Y Chen, MH Chehreghani - Artificial Intelligence, 2023 - Elsevier

Energy efficient navigation constitutes an important challenge in electric vehicles, due to
their limited battery capacity. We employ a Bayesian approach to model the energy …

Cited by 14 · Related articles · All 8 versions

Harnessing the Power of Federated Learning in Federated Contextual Bandits

C Shi, R Zhou, K Yang, C Shen - arXiv preprint arXiv:2312.16341, 2023 - arxiv.org

Federated learning (FL) has demonstrated great potential in revolutionizing distributed
machine learning, and tremendous efforts have been made to extend it beyond the original …

Cited by 1 · Related articles · All 4 versions

Contextual Bandits with Stage-wise Constraints

A Pacchiano, M Ghavamzadeh, P Bartlett - arXiv preprint arXiv …, 2024 - arxiv.org

We study contextual bandits in the presence of a stage-wise constraint (a constraint at each
round), when the constraint must be satisfied both with high probability and in expectation …

Cited by 2 · Related articles · All 2 versions

Neural design for genetic perturbation experiments

A Pacchiano, D Wulsin, RA Barton, L Voloch - arXiv preprint arXiv …, 2022 - arxiv.org

The problem of how to genetically modify cells in order to maximize a certain cellular
phenotype has taken center stage in drug development over the last few years (with, for …

Cited by 5 · Related articles · All 7 versions

One policy is enough: Parallel exploration with a single policy is near-optimal for reward-free reinforcement learning

P Cisneros-Velarde, B Lyu… - International …, 2023 - proceedings.mlr.press

Although parallelism has been extensively used in Reinforcement Learning (RL), the
quantitative effects of parallel exploration are not well understood theoretically. We study the …

Cited by 4 · Related articles · All 6 versions

Randomized Exploration in Cooperative Multi-Agent Reinforcement Learning

HL Hsu, W Wang, M Pajic, P Xu - arXiv preprint arXiv:2404.10728, 2024 - arxiv.org

We present the first study on provably efficient randomized exploration in cooperative
multi-agent reinforcement learning (MARL). We propose a unified algorithm framework for …

Cited by 2 · Related articles · All 2 versions

Second Order Bounds for Contextual Bandits with Function Approximation

A Pacchiano - arXiv preprint arXiv:2409.16197, 2024 - arxiv.org

Many works have developed no-regret algorithms for contextual bandits with
function approximation, where the mean rewards over context-action pairs belong to a …