Provable benefit of multitask representation learning in reinforcement learning
As representation learning becomes a powerful technique to reduce sample complexity in
reinforcement learning (RL) in practice, theoretical understanding of its advantage is still …
Meta Learning in Bandits within shared affine Subspaces
S Bilaj, S Dhouib, S Maghsudi - International Conference on …, 2024 - proceedings.mlr.press
We study the problem of meta-learning several contextual stochastic bandits tasks by
leveraging their concentration around a low dimensional affine subspace, which we learn …
Meta-learning adversarial bandits
We study online learning with bandit feedback across multiple tasks, with the goal of
improving average performance across tasks if they are similar according to some natural …
Lifelong Best-Arm Identification with Misspecified Priors
N Nguyen, C Vernade - Sixteenth European Workshop on …, 2023 - openreview.net
We address the problem of lifelong fixed-budget best-arm identification (BAI), which arises in
realistic sequential A/B testing scenarios where the value of each arm is correlated across …
Online meta-learning in adversarial multi-armed bandits
We study meta-learning for adversarial multi-armed bandits. We consider the online-within-
online setup, in which a player (learner) encounters a sequence of multi-armed bandit …
Transfer learning in bandits with latent continuity
A continuity structure of correlations among arms in multi-armed bandit can bring a
significant acceleration of exploration and reduction of regret, in particular, when there are …
Beyond task diversity: provable representation transfer for sequential multitask linear bandits
We study lifelong learning in linear bandits, where a learner interacts with a sequence of
linear bandit tasks whose parameters lie in an $m$-dimensional subspace of $\mathbb …