Non-stationary representation learning in sequential linear bandits

Y Qin, T Menara, S Oymak, SN Ching… - IEEE Open Journal of …, 2022 - ieeexplore.ieee.org
In this paper, we study representation learning for multi-task decision-making in
non-stationary environments. We consider the framework of sequential linear bandits, where the …

Stochastic contextual bandits with long horizon rewards

Y Qin, Y Li, F Pasqualetti, M Fazel… - Proceedings of the AAAI …, 2023 - ojs.aaai.org
The growing interest in complex decision-making and language modeling problems
highlights the importance of sample-efficient learning over very long horizons. This work …

An Adaptive Method for Non-Stationary Stochastic Multi-armed Bandits with Rewards Generated by a Linear Dynamical System

J Gornet, M Hosseinzadeh, B Sinopoli - arXiv preprint arXiv:2406.10418, 2024 - arxiv.org
Online decision-making can be formulated as the popular stochastic multi-armed bandit
problem where a learner makes decisions (or takes actions) to maximize cumulative …