Large-scale Markov decision problems with KL control cost and its application to crowdsourcing
Y Abbasi-Yadkori, P Bartlett, X Chen… - … on Machine Learning, 2015 - proceedings.mlr.press
International Conference on Machine Learning, 2015 • proceedings.mlr.press
Abstract
We study average and total cost Markov decision problems with large state spaces. Since the computational and statistical costs of finding the optimal policy scale with the size of the state space, we focus on searching for near-optimality in a low-dimensional family of policies. In particular, we show that for problems with a Kullback-Leibler divergence cost function, we can reduce policy optimization to a convex optimization and solve it approximately using a stochastic subgradient algorithm. We show that the performance of the resulting policy is close to the best in the low-dimensional family. We demonstrate the efficacy of our approach by controlling the important crowdsourcing application of budget allocation in crowd labeling.
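The abstract does not spell out the cost function, so the following is only a minimal sketch of what a Kullback-Leibler control cost could look like, assuming the common linearly solvable formulation: a per-step cost equal to a state cost plus the KL divergence between the controlled next-state distribution and fixed passive dynamics. The names `kl_divergence`, `step_cost`, `tilted_row`, `state_cost`, `passive_row`, and `z` are hypothetical and chosen for illustration; they are not taken from the paper.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as equal-length lists."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0.0:
            if qi == 0.0:
                # p puts mass where q has none: the divergence is infinite.
                return math.inf
            total += pi * math.log(pi / qi)
    return total


def step_cost(state_cost, controlled_row, passive_row):
    """Assumed per-step cost: state cost plus KL from the passive dynamics."""
    return state_cost + kl_divergence(controlled_row, passive_row)


def tilted_row(passive_row, z):
    """Assumed policy form: reweight passive next-state probabilities by
    positive weights z (one per next state) and renormalize."""
    weights = [p0 * zi for p0, zi in zip(passive_row, z)]
    norm = sum(weights)
    return [w / norm for w in weights]


if __name__ == "__main__":
    passive = [0.5, 0.5]               # hypothetical passive dynamics from one state
    controlled = tilted_row(passive, [1.0, 3.0])
    print(controlled)
    print(step_cost(1.0, controlled, passive))
```

Under these assumptions, staying with the passive dynamics incurs no control cost, and any deviation is charged in proportion to how far the controlled distribution moves from them. In a low-dimensional policy family the weights `z` would come from a small parameter vector rather than being set per state.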