Discounted Thompson sampling for non-stationary bandit problems

Citing articles:

Task Selection and Assignment for Multi-Modal Multi-Task Dialogue Act Classification with Non-Stationary Multi-Armed Bandits

X He, J Chen, BW Schuller - ICASSP 2024-2024 IEEE …, 2024 - ieeexplore.ieee.org

Multi-task learning (MTL) aims to improve the performance of a primary task by jointly
learning with related auxiliary tasks. Traditional MTL methods select tasks randomly during …

Sliding-Window Thompson Sampling for Non-Stationary Settings

M Fiandri, AM Metelli, F Trovò - arXiv preprint arXiv:2409.05181, 2024 - arxiv.org

Restless Bandits describe sequential decision-making problems in which the
rewards evolve with time independently from the actions taken by the policy-maker. It has …