Fixed-Budget Best-Arm Identification in Contextual Bandits: A Static-Adaptive Algorithm

2 citing articles found

[PDF] mlr.press

Safe exploration for efficient policy evaluation and comparison

R Wan, B Kveton, R Song - International Conference on …, 2022 - proceedings.mlr.press

High-quality data plays a central role in ensuring the accuracy of policy evaluation. This
paper initiates the study of efficient and safe data collection for bandit policy evaluation. We …

Cited by 15 (8 versions)

[PDF] mlr.press

Safe optimal design with applications in off-policy learning

R Zhu, B Kveton - International Conference on Artificial …, 2022 - proceedings.mlr.press

Motivated by practical needs in online experimentation and off-policy learning, we study the
problem of safe optimal design, where we develop a data logging policy that efficiently …

Cited by 10 (2 versions)