Online learning: A comprehensive survey

SCH Hoi, D Sahoo, J Lu, P Zhao - Neurocomputing, 2021 - Elsevier
Online learning represents a family of machine learning methods, where a learner attempts
to tackle some predictive (or any type of decision-making) task by learning from a sequence …

A tutorial on thompson sampling

DJ Russo, B Van Roy, A Kazerouni… - … and Trends® in …, 2018 - nowpublishers.com
Thompson sampling is an algorithm for online decision problems where actions are taken
sequentially in a manner that must balance between exploiting what is known to maximize …

Is pessimism provably efficient for offline rl?

Y Jin, Z Yang, Z Wang - International Conference on …, 2021 - proceedings.mlr.press
We study offline reinforcement learning (RL), which aims to learn an optimal policy based on
a dataset collected a priori. Due to the lack of further interactions with the environment …

[图书][B] Bandit algorithms

T Lattimore, C Szepesvári - 2020 - books.google.com
Decision-making in the face of uncertainty is a significant challenge in machine learning,
and the multi-armed bandit model is a commonly used framework to address it. This …

Derivative-free optimization methods

J Larson, M Menickelly, SM Wild - Acta Numerica, 2019 - cambridge.org
In many optimization problems arising from scientific, engineering and artificial intelligence
applications, objective and constraint functions are available only as the output of a black …

Introduction to multi-armed bandits

A Slivkins - Foundations and Trends® in Machine Learning, 2019 - nowpublishers.com
Multi-armed bandits a simple but very powerful framework for algorithms that make
decisions over time under uncertainty. An enormous body of work has accumulated over the …

Functional variational Bayesian neural networks

S Sun, G Zhang, J Shi, R Grosse - arXiv preprint arXiv:1903.05779, 2019 - arxiv.org
Variational Bayesian neural networks (BNNs) perform variational inference over weights, but
it is difficult to specify meaningful priors and approximate posteriors in a high-dimensional …

Bayesian reinforcement learning: A survey

M Ghavamzadeh, S Mannor, J Pineau… - … and Trends® in …, 2015 - nowpublishers.com
Bayesian methods for machine learning have been widely investigated, yielding principled
methods for incorporating prior information into inference algorithms. In this survey, we …

Parallelised Bayesian optimisation via Thompson sampling

K Kandasamy, A Krishnamurthy… - International …, 2018 - proceedings.mlr.press
We design and analyse variations of the classical Thompson sampling (TS) procedure for
Bayesian optimisation (BO) in settings where function evaluations are expensive but can be …

On information gain and regret bounds in gaussian process bandits

S Vakili, K Khezeli, V Picheny - International Conference on …, 2021 - proceedings.mlr.press
Consider the sequential optimization of an expensive to evaluate and possibly non-convex
objective function $ f $ from noisy feedback, that can be considered as a continuum-armed …