Adapting to a Changing Environment: the Brownian Restless Bandits.
A Slivkins, E Upfal - COLT, 2008 - slivkins.com
In the multi-armed bandit (MAB) problem there are k distributions associated with the
rewards of playing each of k strategies (slot machine arms). The reward distributions are …
Approximation algorithms for restless bandit problems
The restless bandit problem is one of the most well-studied generalizations of the celebrated
stochastic multi-armed bandit (MAB) problem in decision theory. In its ultimate generality, the …
Multi-UAV dynamic routing with partial observations using restless bandit allocation indices
Motivated by the type of missions currently performed by unmanned aerial vehicles, we
investigate a discrete dynamic vehicle routing problem with a potentially large number of …
Approximation algorithms for correlated knapsacks and non-martingale bandits
A Gupta, R Krishnaswamy… - 2011 IEEE 52nd …, 2011 - ieeexplore.ieee.org
In the stochastic knapsack problem, we are given a knapsack of size B, and a set of items
whose sizes and rewards are drawn from a known probability distribution. To know the …
Performance optimization for unmanned vehicle systems
J Le Ny - 2008 - dspace.mit.edu
Technological advances in the area of unmanned vehicles are opening new possibilities for
creating teams of vehicles performing complex missions with some degree of autonomy …
Stratégies optimistes en apprentissage par renforcement
S Filippi - 2010 - theses.hal.science
This thesis deals with model-based methods for solving reinforcement learning
problems. We consider an agent faced with a sequence of …
Sarah Filippi
R Munos, F Garcia, E Moulines, F Clérot - theses.hal.science
This thesis was carried out at the Laboratoire Traitement et Communication de
l'Information (LTCI), a joint research unit of CNRS and Télécom ParisTech. It …