Adapting to a Changing Environment: the Brownian Restless Bandits.

A Slivkins, E Upfal - COLT, 2008 - slivkins.com
In the multi-armed bandit (MAB) problem there are k distributions associated with the
rewards of playing each of k strategies (slot machine arms). The reward distributions are …

Approximation algorithms for restless bandit problems

S Guha, K Munagala, P Shi - Journal of the ACM (JACM), 2010 - dl.acm.org
The restless bandit problem is one of the most well-studied generalizations of the celebrated
stochastic multi-armed bandit (MAB) problem in decision theory. In its ultimate generality, the …

Multi-UAV dynamic routing with partial observations using restless bandit allocation indices

J Le Ny, M Dahleh, E Feron - 2008 American Control …, 2008 - ieeexplore.ieee.org
Motivated by the type of missions currently performed by unmanned aerial vehicles, we
investigate a discrete dynamic vehicle routing problem with a potentially large number of …

Approximation algorithms for correlated knapsacks and non-martingale bandits

A Gupta, R Krishnaswamy… - 2011 IEEE 52nd …, 2011 - ieeexplore.ieee.org
In the stochastic knapsack problem, we are given a knapsack of size B, and a set of items
whose sizes and rewards are drawn from a known probability distribution. To know the …

Performance optimization for unmanned vehicle systems

J Le Ny - 2008 - dspace.mit.edu
Technological advances in the area of unmanned vehicles are opening new possibilities for
creating teams of vehicles performing complex missions with some degree of autonomy …

Stratégies optimistes en apprentissage par renforcement [Optimistic strategies in reinforcement learning]

S Filippi - 2010 - theses.hal.science
This thesis deals with "model-based" methods for solving reinforcement
learning problems. We consider an agent faced with a sequence of …

Sarah Filippi

R Munos, F Garcia, E Moulines, F Clérot - theses.hal.science
This thesis was carried out at the Laboratoire Traitement et Communication de
l'Information (LTCI), a joint research unit (Unité Mixte de Recherche) of CNRS and Télécom ParisTech. It …