On index policies for restless bandit problems

A Slivkins, E Upfal - COLT, 2008 - slivkins.com

In the multi-armed bandit (MAB) problem there are k distributions associated with the
rewards of playing each of k strategies (slot machine arms). The reward distributions are …

Cited by 167 · Related articles · All 11 versions

[PDF] siam.org

Approximation algorithms for restless bandit problems

S Guha, K Munagala, P Shi - Journal of the ACM (JACM), 2010 - dl.acm.org

The restless bandit problem is one of the most well-studied generalizations of the celebrated
stochastic multi-armed bandit (MAB) problem in decision theory. In its ultimate generality, the …

Cited by 161 · Related articles · All 12 versions

[PDF] polymtl.ca

Multi-UAV dynamic routing with partial observations using restless bandit allocation indices

J Le Ny, M Dahleh, E Feron - 2008 American Control …, 2008 - ieeexplore.ieee.org

Motivated by the type of missions currently performed by unmanned aerial vehicles, we
investigate a discrete dynamic vehicle routing problem with a potentially large number of …

Cited by 139 · Related articles · All 19 versions

[PDF] arxiv.org

Approximation algorithms for correlated knapsacks and non-martingale bandits

A Gupta, R Krishnaswamy… - 2011 IEEE 52nd …, 2011 - ieeexplore.ieee.org

In the stochastic knapsack problem, we are given a knapsack of size B, and a set of items
whose sizes and rewards are drawn from a known probability distribution. To know the …

Cited by 93 · Related articles · All 20 versions

[PDF] mit.edu

Performance optimization for unmanned vehicle systems

J Le Ny - 2008 - dspace.mit.edu

Technological advances in the area of unmanned vehicles are opening new possibilities for
creating teams of vehicles performing complex missions with some degree of autonomy …

Cited by 20 · Related articles · All 7 versions

[PDF] hal.science

Stratégies optimistes en apprentissage par renforcement

S Filippi - 2010 - theses.hal.science

This thesis deals with "model-based" methods for solving reinforcement
learning problems. We consider an agent facing a sequence of …

Cited by 8 · Related articles · All 3 versions

[PDF] hal.science

[PDF] Sarah Filippi

R Munos, F Garcia, E Moulines, F Clérot - theses.hal.science

This thesis was carried out at the Laboratoire Traitement et Communication de
l'Information (LTCI), a joint research unit (Unité Mixte de Recherche) of CNRS and Télécom ParisTech. It …