Large-scale Markov decision problems with KL control cost and its application to crowdsourcing

Y Abbasi-Yadkori, P Bartlett, X Chen, A Malek

International Conference on Machine Learning, 2015 • proceedings.mlr.press

Abstract

We study average and total cost Markov decision problems with large state spaces. Since the computational and statistical costs of finding the optimal policy scale with the size of the state space, we focus on searching for near-optimality in a low-dimensional family of policies. In particular, we show that for problems with a Kullback-Leibler divergence cost function, we can reduce policy optimization to a convex optimization and solve it approximately using a stochastic subgradient algorithm. We show that the performance of the resulting policy is close to the best in the low-dimensional family. We demonstrate the efficacy of our approach by controlling the important crowdsourcing application of budget allocation in crowd labeling.
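
The abstract does not spell out the cost function, so the following is only a minimal sketch of what a Kullback-Leibler control cost could look like. It assumes the form common in the linearly solvable MDP literature: a per-state cost plus the KL divergence between the controlled transition distribution and a fixed "passive" one. The names `kl_divergence`, `kl_control_cost`, `state_cost`, `controlled_row`, and `passive_row` are hypothetical and do not come from the paper.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions on the same support."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # the term 0 * log(0 / q) is taken to be 0
        if qi == 0.0:
            return math.inf  # p puts mass where q has none
        total += pi * math.log(pi / qi)
    return total


def kl_control_cost(state_cost, controlled_row, passive_row):
    """Assumed one-step cost: state cost plus a KL penalty for deviating
    from the passive transition distribution (hypothetical form)."""
    return state_cost + kl_divergence(controlled_row, passive_row)


# Illustrative two-state transition rows (made-up numbers).
passive = [0.25, 0.75]
controlled = [0.5, 0.5]

# Following the passive dynamics incurs only the state cost.
print(kl_control_cost(1.0, passive, passive))
# Deviating from the passive dynamics adds a positive KL penalty.
print(kl_control_cost(1.0, controlled, passive))
```

Under this assumed form, the policy is the choice of controlled transition distribution itself, and the cost charges for how far that choice departs from the passive dynamics.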

Cited by 18 · Related articles · All 11 versions