A Unified Analysis of Nonstochastic Delayed Feedback for Combinatorial Semi-Bandits, Linear...

Non-stochastic Bandits With Evolving Observations

Y Bar-On, Y Mansour - arXiv preprint arXiv:2405.16843, 2024 - arxiv.org

We introduce a novel online learning framework that unifies and generalizes
pre-established models, such as delayed and corrupted feedback, to encompass adversarial …

Regret Guarantees for Adversarial Contextual Bandits with Delayed Feedback

L Erez, O Levy, Y Mansour - Seventeenth European Workshop on … - openreview.net

In this paper we present regret minimization algorithms for the contextual multi-armed bandit
(CMAB) problem in the presence of delayed feedback, a scenario where reward …