Online learning: A comprehensive survey
Online learning represents a family of machine learning methods, where a learner attempts
to tackle some predictive (or any type of decision-making) task by learning from a sequence …
A tutorial on Thompson sampling
Thompson sampling is an algorithm for online decision problems where actions are taken
sequentially in a manner that must balance between exploiting what is known to maximize …
Is pessimism provably efficient for offline RL?
We study offline reinforcement learning (RL), which aims to learn an optimal policy based on
a dataset collected a priori. Due to the lack of further interactions with the environment …
[Book] Bandit algorithms
T Lattimore, C Szepesvári - 2020 - books.google.com
Decision-making in the face of uncertainty is a significant challenge in machine learning,
and the multi-armed bandit model is a commonly used framework to address it. This …
Derivative-free optimization methods
In many optimization problems arising from scientific, engineering and artificial intelligence
applications, objective and constraint functions are available only as the output of a black …
Introduction to multi-armed bandits
A Slivkins - Foundations and Trends® in Machine Learning, 2019 - nowpublishers.com
Multi-armed bandits are a simple but very powerful framework for algorithms that make
decisions over time under uncertainty. An enormous body of work has accumulated over the …
Functional variational Bayesian neural networks
Variational Bayesian neural networks (BNNs) perform variational inference over weights, but
it is difficult to specify meaningful priors and approximate posteriors in a high-dimensional …
Bayesian reinforcement learning: A survey
Bayesian methods for machine learning have been widely investigated, yielding principled
methods for incorporating prior information into inference algorithms. In this survey, we …
Parallelised Bayesian optimisation via Thompson sampling
K Kandasamy, A Krishnamurthy… - International …, 2018 - proceedings.mlr.press
We design and analyse variations of the classical Thompson sampling (TS) procedure for
Bayesian optimisation (BO) in settings where function evaluations are expensive but can be …
On information gain and regret bounds in Gaussian process bandits
Consider the sequential optimization of an expensive-to-evaluate and possibly non-convex
objective function $ f $ from noisy feedback, that can be considered as a continuum-armed …