Non-asymptotic analysis of Monte Carlo tree search
In this work, we consider the popular tree-based search strategy within the framework of
reinforcement learning, the Monte Carlo Tree Search (MCTS), in the context of infinite …
A provably efficient sample collection strategy for reinforcement learning
One of the challenges in online reinforcement learning (RL) is that the agent needs to trade
off the exploration of the environment and the exploitation of the samples to optimize its …
Planning in entropy-regularized Markov decision processes and games
JB Grill, O Darwiche Domingues… - Advances in …, 2019 - proceedings.neurips.cc
We propose SmoothCruiser, a new planning algorithm for estimating the value function in
entropy-regularized Markov decision processes and two-player games, given a generative …
Goal-oriented exploration for reinforcement learning
J Tarbouriech - 2022 - theses.hal.science
Learning to reach goals is a competence of high practical relevance to acquire for intelligent
agents. For instance, this encompasses many navigation tasks ("go to target X"), robotic …
Exploration in Reinforcement Learning: Beyond Finite State-Spaces
OD Domingues - 2022 - theses.hal.science
Reinforcement learning (RL) is a powerful machine learning framework to design algorithms
that learn to make decisions and to interact with the world. Algorithms for RL can be …
sujet2023-recherche-optimisation Search algorithms for optimization
B Scherrer, EBSF Colas, C Bureau - team.inria.fr
This topic concerns optimizing the use of computing resources such as
memory and CPU time to improve performance in solving …
Data Efficient Reinforcement Learning
Z Xu - 2021 - dspace.mit.edu
Reinforcement learning (RL) has recently emerged as a generic yet powerful solution for
learning complex decision-making policies, providing the key foundational underpinnings of …