Reinforcement learning, bit by bit

X Lu, B Van Roy, V Dwaracherla… - … and Trends® in …, 2023 - nowpublishers.com
Reinforcement learning agents have demonstrated remarkable achievements in simulated
environments. Data efficiency poses an impediment to carrying this success over to real …

Provable and practical: Efficient exploration in reinforcement learning via Langevin Monte Carlo

H Ishfaq, Q Lan, P Xu, AR Mahmood, D Precup… - arXiv preprint arXiv …, 2023 - arxiv.org
We present a scalable and effective exploration strategy based on Thompson sampling for
reinforcement learning (RL). One of the key shortcomings of existing Thompson sampling …

Ensembles for uncertainty estimation: Benefits of prior functions and bootstrapping

V Dwaracherla, Z Wen, I Osband, X Lu… - arXiv preprint arXiv …, 2022 - arxiv.org
In machine learning, an agent needs to estimate uncertainty to efficiently explore and adapt
and to make effective decisions. A common approach to uncertainty estimation maintains an …

Sequential Decision Making with Expert Demonstrations under Unobserved Heterogeneity

V Balazadeh, K Chidambaram, V Nguyen… - arXiv preprint arXiv …, 2024 - arxiv.org
We study the problem of online sequential decision-making given auxiliary demonstrations
from experts who made their decisions based on unobserved contextual information. These …

Bayesian Model-Free Deep Reinforcement Learning

PR van der Vaart - Proceedings of the 23rd International Conference on …, 2024 - ifaamas.org
Exploration in reinforcement learning remains a difficult challenge. In order to drive
exploration, ensembles with randomized prior functions have recently been popularized to …

Bayesian Ensembles for Exploration in Deep Q-Learning

P Van der Vaart, N Yorke-Smith… - The Sixteenth Workshop …, 2024 - openreview.net
Exploration in reinforcement learning remains a difficult challenge. In order to drive
exploration, ensembles with randomized prior functions have recently been popularized to …