Authors
Enrique Gonzalez, Josep Ginebra
Publication date
2019/4/30
Book
Advances on Methodological and Applied Aspects of Probability and Statistics
Pages
413-423
Publisher
CRC Press
Description
Assume we can control an input x_n ∈ C ⊂ R and observe one response y_n such that E[y_n | x_n, β] = f(x_n; β), and that the objective is to keep all the responses close to a target T. We propose sequential designs that always improve on Bayesian certainty equivalence designs by searching for the best design in a family that contains them. To regulate how far, and in which direction, they move away from the certainty equivalence choice, the new designs experiment on a credible region for the root of f(x; β) = T. These heuristics perturb certainty equivalence to encourage 'active' learning about β and so improve future control. We also describe how to apply this approach to the response surface bandit, where the aim is to keep all the responses close to the maximum of f(x; β).
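The idea of perturbing the certainty equivalence choice within a credible region for the root can be illustrated with a small sketch. Everything specific here is an assumption made for illustration and is not taken from the chapter: a linear response f(x; β) = β0 + β1·x, a known noise standard deviation `sigma`, a normal prior on β, a Monte Carlo credible interval for the root, and a hypothetical tuning parameter `k` in [-1, 1] that sets the distance and direction of the move (k = 0 recovers certainty equivalence).

```python
import math
import random

def clip(x, lo, hi):
    return max(lo, min(hi, x))

def posterior(xs, ys, m0, S0_inv, sigma):
    """Conjugate normal posterior for beta = (b0, b1) in y = b0 + b1*x + noise."""
    # Prior precision entries (a, b; b, c) and natural-parameter vector (e0, e1).
    a, b, c = S0_inv[0][0], S0_inv[0][1], S0_inv[1][1]
    e0 = S0_inv[0][0] * m0[0] + S0_inv[0][1] * m0[1]
    e1 = S0_inv[1][0] * m0[0] + S0_inv[1][1] * m0[1]
    w = 1.0 / sigma ** 2
    for x, y in zip(xs, ys):
        a += w
        b += w * x
        c += w * x * x
        e0 += w * y
        e1 += w * x * y
    det = a * c - b * b
    cov = [[c / det, -b / det], [-b / det, a / det]]
    mean = [cov[0][0] * e0 + cov[0][1] * e1, cov[1][0] * e0 + cov[1][1] * e1]
    return mean, cov

def certainty_equivalence(mean, T, C):
    """Solve f(x; posterior mean) = T for the assumed linear model, clipped to C."""
    if abs(mean[1]) < 1e-12:  # flat fitted line: fall back to the midpoint of C
        return 0.5 * (C[0] + C[1])
    return clip((T - mean[0]) / mean[1], C[0], C[1])

def root_credible_interval(mean, cov, T, C, level=0.9, draws=2000, rng=None):
    """Monte Carlo equal-tailed interval for the root of f(x; beta) = T."""
    rng = rng or random.Random(0)
    # Cholesky factor of the 2x2 posterior covariance.
    l00 = math.sqrt(cov[0][0])
    l10 = cov[1][0] / l00
    l11 = math.sqrt(max(cov[1][1] - l10 ** 2, 0.0))
    roots = []
    for _ in range(draws):
        z0, z1 = rng.gauss(0, 1), rng.gauss(0, 1)
        b0 = mean[0] + l00 * z0
        b1 = mean[1] + l10 * z0 + l11 * z1
        if b1 != 0.0:
            roots.append(clip((T - b0) / b1, C[0], C[1]))
    roots.sort()
    tail = (1 - level) / 2
    last = len(roots) - 1
    return roots[int(tail * last)], roots[int((1 - tail) * last)]

def perturbed_design(x_ce, interval, k, C):
    """Move a fraction |k| from x_ce toward one end of the interval (hypothetical rule)."""
    lo, hi = interval
    target = hi if k >= 0 else lo
    return clip(x_ce + abs(k) * (target - x_ce), C[0], C[1])

def run(k, n_steps=30, T=0.0, C=(-5.0, 5.0), true_beta=(1.0, 2.0), sigma=1.0, seed=1):
    """Simulate sequential control and return the total squared deviation from T."""
    rng = random.Random(seed)
    m0, S0_inv = [0.0, 1.0], [[1.0, 0.0], [0.0, 1.0]]  # assumed prior
    xs, ys = [], []
    for _ in range(n_steps):
        mean, cov = posterior(xs, ys, m0, S0_inv, sigma)
        x_ce = certainty_equivalence(mean, T, C)
        interval = root_credible_interval(mean, cov, T, C, rng=rng)
        x = perturbed_design(x_ce, interval, k, C)
        y = true_beta[0] + true_beta[1] * x + rng.gauss(0, sigma)
        xs.append(x)
        ys.append(y)
    return sum((y - T) ** 2 for y in ys)
```

Calling `run(0.0)` simulates plain certainty equivalence, while a nonzero value such as `run(0.3)` experiments away from it inside the credible interval. Comparing the returned losses over a grid of `k` values mimics, in a very simplified way, searching for the best design in a family that contains certainty equivalence.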