Learning on the edge: Online learning with stochastic feedback graphs

E Esposito, F Fusco… - Advances in …, 2022 - proceedings.neurips.cc
The framework of feedback graphs is a generalization of sequential decision-making with
bandit or full information feedback. In this work, we study an extension where the directed …

A near-optimal best-of-both-worlds algorithm for online learning with feedback graphs

C Rouyer, D van der Hoeven… - Advances in …, 2022 - proceedings.neurips.cc
We consider online learning with feedback graphs, a sequential decision-making framework
where the learner's feedback is determined by a directed graph over the action set. We …

An α-regret analysis of Adversarial Bilateral Trade

Y Azar, A Fiat, F Fusco - Advances in Neural Information …, 2022 - proceedings.neurips.cc
We study sequential bilateral trade where sellers' and buyers' valuations are completely
arbitrary (i.e., determined by an adversary). Sellers and buyers are strategic agents with …

Online learning with set-valued feedback

V Raman, U Subedi, A Tewari - The Thirty Seventh Annual …, 2024 - proceedings.mlr.press
We study a variant of online multiclass classification where the learner predicts a single
label but receives a set of labels as feedback. In this model, the learner is penalized …

Practical contextual bandits with feedback graphs

M Zhang, Y Zhang, O Vrousgou… - Advances in Neural …, 2024 - proceedings.neurips.cc
While contextual bandit has a mature theory, effectively leveraging different feedback
patterns to enhance the pace of learning remains unclear. Bandits with feedback graphs …

A regret-variance trade-off in online learning

D Van der Hoeven, N Zhivotovskiy… - Advances in Neural …, 2022 - proceedings.neurips.cc
We consider prediction with expert advice for strongly convex and bounded losses, and
investigate trade-offs between regret and "variance" (i.e., squared difference of learner's …

Online Structured Prediction with Fenchel--Young Losses and Improved Surrogate Regret for Online Multiclass Classification with Logistic Loss

S Sakaue, H Bao, T Tsuchiya, T Oki - arXiv preprint arXiv:2402.08180, 2024 - arxiv.org
This paper studies online structured prediction with full-information feedback. For online
multiclass classification, van der Hoeven (2020) has obtained surrogate regret bounds …

Neural Active Learning Meets the Partial Monitoring Framework

M Heuillet, O Ahmad, A Durand - arXiv preprint arXiv:2405.08921, 2024 - arxiv.org
We focus on the online-based active learning (OAL) setting where an agent operates over a
stream of observations and trades off between the costly acquisition of information (labelled …

Trading-off payments and accuracy in online classification with paid stochastic experts

D Van Der Hoeven, C Pike-Burke… - International …, 2023 - proceedings.mlr.press
We investigate online classification with paid stochastic experts. Here, before making their
prediction, each expert must be paid. The amount that we pay each expert directly …

An α-regret analysis of adversarial bilateral trade

Y Azar, A Fiat, F Fusco - Artificial Intelligence, 2024 - Elsevier
We study sequential bilateral trade where sellers' and buyers' valuations are completely
arbitrary (i.e., determined by an adversary). Sellers and buyers are strategic agents with …