On the complexity of representation learning in contextual linear bandits

A Tirinzoni, M Pirotta, A Lazaric - … Conference on Artificial …, 2023 - proceedings.mlr.press
In contextual linear bandits, the reward function is assumed to be a linear combination of an
unknown reward vector and a given embedding of context-arm pairs. In practice, the …

Representation Abstractions as Incentives for Reinforcement Learning Agents: A Robotic Grasping Case Study

P Petropoulakis, L Gräf, J Josifovski, M Malmir… - arXiv preprint arXiv …, 2023 - arxiv.org
Choosing an appropriate representation of the environment for the underlying
decision-making process of the RL agent is not always straightforward. The state representation …

Bounded (O(1)) regret recommendation learning via synthetic controls oracle

EH Kang, PR Kumar - 2023 59th Annual Allerton Conference …, 2023 - ieeexplore.ieee.org
In online exploration systems where users with fixed preferences repeatedly arrive, it has
recently been shown that O(1), i.e., bounded regret, can be achieved when the system is …