Advances in variational inference

C Zhang, J Bütepage, H Kjellström… - IEEE transactions on …, 2018 - ieeexplore.ieee.org
Many modern unsupervised or semi-supervised machine learning algorithms rely on
Bayesian probabilistic models. These models are usually intractable and thus require …
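
As a concrete illustration of the approximations such surveys cover, below is a minimal sketch of mean-field variational inference via coordinate ascent (CAVI) for a toy Normal-Gamma model; the model, hyperparameters, and iteration count are illustrative assumptions, not taken from the survey.

```python
# Minimal sketch of mean-field variational inference (CAVI) for a toy model:
# x_i ~ N(mu, 1/tau) with a Normal-Gamma prior on (mu, tau). The factorized
# posterior q(mu, tau) = q(mu) q(tau) has closed-form coordinate updates.
# Hyperparameter values below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=0.5, size=100)      # synthetic data
N, xbar = len(x), x.mean()

# Prior hyperparameters: mu ~ N(mu0, 1/(lam0*tau)), tau ~ Gamma(a0, b0)
mu0, lam0, a0, b0 = 0.0, 1.0, 1.0, 1.0

# Variational factors: q(mu) = N(mu_n, 1/lam_n), q(tau) = Gamma(a_n, b_n)
E_tau = 1.0
for _ in range(50):                               # coordinate ascent iterations
    # Update q(mu) given E_q[tau]
    mu_n = (lam0 * mu0 + N * xbar) / (lam0 + N)
    lam_n = (lam0 + N) * E_tau
    # Update q(tau) given the first two moments of q(mu)
    a_n = a0 + (N + 1) / 2.0
    E_sq = ((x - mu_n) ** 2 + 1.0 / lam_n).sum()  # E_q[sum_i (x_i - mu)^2]
    b_n = b0 + 0.5 * (E_sq + lam0 * ((mu_n - mu0) ** 2 + 1.0 / lam_n))
    E_tau = a_n / b_n

print("posterior mean of mu ~", mu_n, " posterior mean of tau ~", E_tau)
```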

Kernel mean embedding of distributions: A review and beyond

K Muandet, K Fukumizu… - … and Trends® in …, 2017 - nowpublishers.com
A Hilbert space embedding of a distribution—in short, a kernel mean embedding—has
recently emerged as a powerful tool for machine learning and statistical inference. The basic …
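
The basic construction can be illustrated with an empirical kernel mean embedding and the maximum mean discrepancy (MMD) it induces between two samples; the RBF kernel, bandwidth, and sample sizes below are illustrative assumptions.

```python
# Minimal sketch of an empirical kernel mean embedding and the resulting
# squared MMD between two samples, using an RBF kernel.
import numpy as np

def rbf_kernel(X, Y, bandwidth=1.0):
    """Gram matrix k(x, y) = exp(-||x - y||^2 / (2 * bandwidth^2))."""
    sq = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2.0 * bandwidth ** 2))

def mmd2(X, Y, bandwidth=1.0):
    """Squared MMD = ||mean embedding of X - mean embedding of Y||^2 in the RKHS."""
    Kxx = rbf_kernel(X, X, bandwidth)
    Kyy = rbf_kernel(Y, Y, bandwidth)
    Kxy = rbf_kernel(X, Y, bandwidth)
    return Kxx.mean() + Kyy.mean() - 2.0 * Kxy.mean()

rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, size=(200, 1))   # sample from P
Y = rng.normal(0.5, 1.0, size=(200, 1))   # sample from Q
print("MMD^2 estimate:", mmd2(X, Y))
```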

Monte Carlo gradient estimation in machine learning

S Mohamed, M Rosca, M Figurnov, A Mnih - Journal of Machine Learning …, 2020 - jmlr.org
This paper is a broad and accessible survey of the methods we have at our disposal for
Monte Carlo gradient estimation in machine learning and across the statistical sciences: the …
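
The two workhorse estimators covered by such surveys, the score-function (REINFORCE) and pathwise (reparameterization) estimators, can be contrasted on a toy objective; the test function and parameter values below are illustrative assumptions.

```python
# Minimal sketch of two Monte Carlo gradient estimators for
# d/dmu E_{x ~ N(mu, sigma^2)}[f(x)] with f(x) = x^2 (true gradient: 2*mu).
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n = 1.5, 1.0, 100_000
f = lambda x: x ** 2

# Score-function (REINFORCE) estimator: E[f(x) * d/dmu log N(x; mu, sigma^2)]
x = rng.normal(mu, sigma, size=n)
score_grad = np.mean(f(x) * (x - mu) / sigma ** 2)

# Pathwise (reparameterization) estimator: x = mu + sigma*eps, E[f'(mu + sigma*eps)]
eps = rng.normal(size=n)
path_grad = np.mean(2.0 * (mu + sigma * eps))

print("true gradient 2*mu =", 2 * mu)
print("score-function estimate:", score_grad)   # typically much higher variance
print("pathwise estimate:     ", path_grad)
```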

Virtual adversarial training: A regularization method for supervised and semi-supervised learning

T Miyato, S Maeda, M Koyama… - IEEE transactions on …, 2018 - ieeexplore.ieee.org
We propose a new regularization method based on virtual adversarial loss: a new measure
of local smoothness of the conditional label distribution given input. Virtual adversarial loss …
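
A minimal sketch of the virtual-adversarial perturbation for a fixed binary logistic model is given below, using a single power-iteration step to approximate the direction that most increases the KL divergence of the predictive distribution; the model weights and the xi/eps constants are illustrative assumptions, and the closed-form gradient is specific to this toy model rather than the general network case treated in the paper.

```python
# Minimal sketch of the virtual-adversarial perturbation for one input under
# a fixed binary logistic model p(y=1|x) = sigmoid(w.x). One power-iteration
# step approximates the direction that most increases KL(p(.|x) || p(.|x+r)).
import numpy as np

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

def kl_bernoulli(p, q, tiny=1e-12):
    return (p * np.log((p + tiny) / (q + tiny))
            + (1 - p) * np.log((1 - p + tiny) / (1 - q + tiny)))

rng = np.random.default_rng(0)
w = np.array([1.0, -2.0])        # fixed "model" weights (assumption)
x = np.array([0.3, 0.7])         # an unlabeled input
xi, eps = 1e-6, 0.5              # power-iteration scale, perturbation radius

p = sigmoid(w @ x)               # current predictive distribution at x

# One power-iteration step: gradient of KL(p || p(.|x + xi*d)) with respect
# to the perturbation; for this logistic model it is (q - p) * w in closed form.
d = rng.normal(size=x.shape)
d /= np.linalg.norm(d)
q = sigmoid(w @ (x + xi * d))
g = (q - p) * w
d = g / (np.linalg.norm(g) + 1e-12)

r_adv = eps * d                                   # virtual adversarial perturbation
lds = kl_bernoulli(p, sigmoid(w @ (x + r_adv)))   # local distributional smoothness loss
print("r_adv =", r_adv, " LDS =", lds)
```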

Stein variational gradient descent: A general purpose Bayesian inference algorithm

Q Liu, D Wang - Advances in neural information processing …, 2016 - proceedings.neurips.cc
We propose a general purpose variational inference algorithm that forms a natural
counterpart of gradient descent for optimization. Our method iteratively transports a set of …
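
The particle transport can be sketched in one dimension: each SVGD update combines a kernel-weighted average of the target's score with a repulsive kernel-gradient term; the Gaussian target, bandwidth, step size, and iteration count below are illustrative assumptions.

```python
# Minimal sketch of Stein variational gradient descent (SVGD) in 1-D,
# transporting a set of particles toward the target N(2, 1).
import numpy as np

def grad_log_p(x):                              # score of the target N(2, 1)
    return -(x - 2.0)

def svgd_step(x, step=0.1, h=1.0):
    diff = x[:, None] - x[None, :]              # diff[i, j] = x_i - x_j
    K = np.exp(-diff ** 2 / h)                  # RBF kernel k(x_j, x_i)
    grad_k = 2.0 * diff / h * K                 # d/dx_j k(x_j, x_i)
    # Attractive (kernel-weighted score) plus repulsive (kernel gradient) terms
    phi = (K @ grad_log_p(x) + grad_k.sum(axis=1)) / len(x)
    return x + step * phi

rng = np.random.default_rng(0)
particles = rng.normal(-5.0, 0.5, size=50)      # start far from the target
for _ in range(500):
    particles = svgd_step(particles)
print("particle mean ~", particles.mean(), " std ~", particles.std())
```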

Gaussian processes and kernel methods: A review on connections and equivalences

M Kanagawa, P Hennig, D Sejdinovic… - arXiv preprint arXiv …, 2018 - arxiv.org
This paper is an attempt to bridge the conceptual gaps between researchers working on the
two widely used approaches based on positive definite kernels: Bayesian learning or …
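
One of the reviewed equivalences is easy to check numerically: the GP-regression posterior mean coincides with the kernel ridge regression predictor when the ridge parameter equals the GP noise variance; the kernel, data, and noise level below are illustrative assumptions.

```python
# Minimal sketch: GP posterior mean vs. kernel ridge regression with
# ridge parameter equal to the GP noise variance - the same predictor.
import numpy as np

def rbf(A, B, ell=0.5):
    return np.exp(-((A[:, None] - B[None, :]) ** 2) / (2.0 * ell ** 2))

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(-3, 3, size=30))
y = np.sin(X) + 0.1 * rng.normal(size=30)
Xs = np.linspace(-3, 3, 5)                      # test inputs
noise_var = 0.01                                # GP noise variance

K, Ks = rbf(X, X), rbf(Xs, X)

# GP posterior mean: k(x*, X) (K + sigma_n^2 I)^{-1} y, via a Cholesky solve
L = np.linalg.cholesky(K + noise_var * np.eye(len(X)))
gp_mean = Ks @ np.linalg.solve(L.T, np.linalg.solve(L, y))

# Kernel ridge regression with lambda = sigma_n^2: dual coefficients
# alpha = (K + lambda I)^{-1} y give the same predictor
alpha = np.linalg.solve(K + noise_var * np.eye(len(X)), y)
krr_pred = Ks @ alpha

print(np.allclose(gp_mean, krr_pred))           # True: identical predictions
```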

A kernelized Stein discrepancy for goodness-of-fit tests

Q Liu, J Lee, M Jordan - International conference on …, 2016 - proceedings.mlr.press
We derive a new discrepancy statistic for measuring differences between two probability
distributions based on combining Stein's identity and the reproducing kernel Hilbert space …
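
The statistic only requires the score function of the model, not its normalizing constant; a one-dimensional sketch with an RBF kernel is given below, where the bandwidth and sample sizes are illustrative assumptions.

```python
# Minimal sketch of the kernelized Stein discrepancy (KSD) in 1-D with an
# RBF kernel, used as a goodness-of-fit statistic against the model N(0, 1).
import numpy as np

score = lambda x: -x                      # d/dx log N(x; 0, 1)

def ksd(x, h=1.0):
    d = x[:, None] - x[None, :]           # pairwise differences x_i - x_j
    K = np.exp(-d ** 2 / (2 * h))
    dKdx = -d / h * K                     # d k(x, y) / d x
    dKdy = d / h * K                      # d k(x, y) / d y
    dKdxdy = (1.0 / h - d ** 2 / h ** 2) * K
    sx = score(x)
    # Stein kernel u_p(x, y) evaluated on all pairs
    U = (sx[:, None] * sx[None, :] * K
         + sx[:, None] * dKdy + sx[None, :] * dKdx + dKdxdy)
    n = len(x)
    return (U.sum() - np.trace(U)) / (n * (n - 1))   # U-statistic estimate

rng = np.random.default_rng(0)
print("samples from N(0,1):", ksd(rng.normal(0.0, 1.0, 500)))   # close to 0
print("samples from N(1,1):", ksd(rng.normal(1.0, 1.0, 500)))   # clearly larger
```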

A conceptual introduction to Hamiltonian Monte Carlo

M Betancourt - arXiv preprint arXiv:1701.02434, 2017 - arxiv.org
Hamiltonian Monte Carlo has proven a remarkable empirical success, but only recently have
we begun to develop a rigorous understanding of why it performs so well on difficult …
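
The mechanics are compact enough to sketch for a one-dimensional standard-normal target: leapfrog integration of Hamilton's equations followed by a Metropolis accept/reject step; the step size and path length below are illustrative assumptions.

```python
# Minimal sketch of Hamiltonian Monte Carlo for a 1-D standard normal target.
import numpy as np

log_p = lambda q: -0.5 * q ** 2          # unnormalized log target N(0, 1)
grad_log_p = lambda q: -q

def hmc_step(q, rng, step=0.2, n_leapfrog=20):
    p = rng.normal()                      # resample momentum
    q_new, p_new = q, p
    # Leapfrog integration of Hamilton's equations
    p_new += 0.5 * step * grad_log_p(q_new)
    for _ in range(n_leapfrog - 1):
        q_new += step * p_new
        p_new += step * grad_log_p(q_new)
    q_new += step * p_new
    p_new += 0.5 * step * grad_log_p(q_new)
    # Metropolis correction based on the change in total energy
    current_H = -log_p(q) + 0.5 * p ** 2
    proposed_H = -log_p(q_new) + 0.5 * p_new ** 2
    return q_new if rng.uniform() < np.exp(current_H - proposed_H) else q

rng = np.random.default_rng(0)
q, samples = 0.0, []
for _ in range(5000):
    q = hmc_step(q, rng)
    samples.append(q)
samples = np.asarray(samples)
print("sample mean ~", samples.mean(), " sample variance ~", samples.var())
```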

Backpropagation through the void: Optimizing control variates for black-box gradient estimation

W Grathwohl, D Choi, Y Wu, G Roeder… - arXiv preprint arXiv …, 2017 - arxiv.org
Gradient-based optimization is the foundation of deep learning and reinforcement learning.
Even when the mechanism being optimized is unknown or not differentiable, optimization …
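
The variance-reduction idea behind such estimators can be sketched with a score-function (REINFORCE) gradient for a discrete, non-differentiable objective; the constant baseline used below is a deliberate simplification and an assumption, not the learned surrogate control variate the paper proposes.

```python
# Minimal sketch of variance reduction with a control variate in a
# score-function gradient estimator for d/dtheta E_{b~Bernoulli(theta)}[f(b)].
import numpy as np

rng = np.random.default_rng(0)
theta, n = 0.3, 200_000
f = lambda b: (b - 0.45) ** 2                 # black-box, non-differentiable in b
true_grad = f(1) - f(0)                       # d/dtheta [theta*f(1) + (1-theta)*f(0)]

b = (rng.uniform(size=n) < theta).astype(float)   # Bernoulli(theta) samples
score = b / theta - (1 - b) / (1 - theta)         # d/dtheta log p(b; theta)

plain = f(b) * score                          # REINFORCE estimator samples
baseline = 0.5 * (f(0) + f(1))                # constant control variate: any constant
controlled = (f(b) - baseline) * score        # keeps the estimator unbiased (E[score] = 0)

print("true gradient:", true_grad)
print("no control variate:   mean %.4f  var %.4f" % (plain.mean(), plain.var()))
print("with control variate: mean %.4f  var %.4f" % (controlled.mean(), controlled.var()))
```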

A survey of Monte Carlo methods for parameter estimation

D Luengo, L Martino, M Bugallo, V Elvira… - EURASIP Journal on …, 2020 - Springer
Statistical signal processing applications usually require the estimation of some parameters
of interest given a set of observed data. These estimates are typically obtained either by …
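
A representative member of this family is self-normalized importance sampling for a posterior mean; the model, prior, and proposal below are illustrative assumptions.

```python
# Minimal sketch of self-normalized importance sampling for Bayesian
# parameter estimation: the posterior mean of a Gaussian location parameter
# is approximated with weighted draws from a broad Gaussian proposal.
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(1.2, 1.0, size=20)                     # observations, unit noise variance

def log_post(theta):                                     # unnormalized log posterior
    log_prior = -0.5 * theta ** 2 / 10.0                 # theta ~ N(0, 10)
    log_lik = -0.5 * ((data[None, :] - theta[:, None]) ** 2).sum(axis=1)
    return log_prior + log_lik

# Draw from the proposal N(0, 3^2) and weight by posterior / proposal
theta = rng.normal(0.0, 3.0, size=50_000)
log_w = log_post(theta) - (-0.5 * theta ** 2 / 9.0 - np.log(3.0))
w = np.exp(log_w - log_w.max())                          # stabilize before normalizing
w /= w.sum()

post_mean = (w * theta).sum()
ess = 1.0 / (w ** 2).sum()                               # effective sample size
print("posterior mean estimate:", post_mean, " ESS:", ess)
```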