Beyond UCB: Optimal and efficient contextual bandits with regression oracles
A fundamental challenge in contextual bandits is to develop flexible, general-purpose
algorithms with computational requirements no worse than classical supervised learning …
Bypassing the simulator: Near-optimal adversarial linear contextual bandits
We consider the adversarial linear contextual bandit problem, where the loss vectors are
selected fully adversarially and the per-round action set (i.e., the context) is drawn from a fixed …
Breaking the curse of multiagency: Provably efficient decentralized multi-agent RL with function approximation
A unique challenge in Multi-Agent Reinforcement Learning (MARL) is the \emph{curse of
multiagency}, where the description length of the game as well as the complexity of many …
Misspecified Gaussian process bandit optimization
I Bogunovic, A Krause - Advances in neural information …, 2021 - proceedings.neurips.cc
We consider the problem of optimizing a black-box function based on noisy bandit feedback.
Kernelized bandit algorithms have shown strong empirical and theoretical performance for …
Breaking the curse of multiagents in a large state space: RL in Markov games with independent linear function approximation
We propose a new model, \emph{independent linear Markov game}, for multi-agent
reinforcement learning with a large state space and a large number of agents. This is a class …
Stochastic linear bandits robust to adversarial attacks
We consider a stochastic linear bandit problem in which the rewards are not only subject to
random noise, but also adversarial attacks subject to a suitable budget $C$ (i.e., an upper …
Refined regret for adversarial MDPs with linear function approximation
We consider learning in an adversarial Markov Decision Process (MDP) where the loss
functions can change arbitrarily over $K$ episodes and the state space can be arbitrarily …
Improved regret for efficient online reinforcement learning with linear function approximation
We study reinforcement learning with linear function approximation and adversarially
changing cost functions, a setup that has mostly been considered under simplifying …
First- and second-order bounds for adversarial linear contextual bandits
We consider the adversarial linear contextual bandit setting, which allows for the loss
functions associated with each of $K$ arms to change over time without restriction …
Offline primal-dual reinforcement learning for linear MDPs
Abstract Offline Reinforcement Learning (RL) aims to learn a near-optimal policy from a fixed
dataset of transitions collected by another policy. This problem has attracted a lot of attention …