Non-convex learning via replica exchange stochastic gradient MCMC

W Deng, Q Feng, L Gao, F Liang… - … Conference on Machine …, 2020 - proceedings.mlr.press
Replica exchange Monte Carlo (reMC), also known as parallel tempering, is an
important technique for accelerating the convergence of the conventional Markov Chain …
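
As a rough illustration of the swap mechanism this entry refers to, the sketch below runs two Langevin-type chains at different temperatures and exchanges their states with a Metropolis-style probability based on the energy gap. The energy function, step size, and temperatures are placeholders, and the swap rule is the idealized parallel-tempering acceptance rather than the paper's stochastic-gradient bias correction.

```python
import numpy as np

def sgld_step(x, grad_U, lr, temperature, rng):
    """One stochastic gradient Langevin step at a given temperature."""
    noise = rng.normal(size=x.shape)
    return x - lr * grad_U(x) + np.sqrt(2.0 * lr * temperature) * noise

def replica_exchange_sgld(grad_U, U, x_low, x_high, lr=1e-3,
                          temps=(1.0, 10.0), n_iters=1000, seed=0):
    """Two chains at a low and a high temperature that occasionally swap states.

    Placeholder example: U is the (mini-batch) energy, grad_U its gradient.
    The swap test below is the idealized parallel-tempering acceptance;
    the stochastic-gradient version in the paper adds a correction term.
    """
    rng = np.random.default_rng(seed)
    t_low, t_high = temps
    for _ in range(n_iters):
        x_low = sgld_step(x_low, grad_U, lr, t_low, rng)
        x_high = sgld_step(x_high, grad_U, lr, t_high, rng)
        # Swap with probability min(1, exp((1/t_low - 1/t_high) * (U(x_low) - U(x_high)))).
        log_ratio = (1.0 / t_low - 1.0 / t_high) * (U(x_low) - U(x_high))
        if np.log(rng.uniform()) < log_ratio:
            x_low, x_high = x_high, x_low
    return x_low
```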

An adaptive empirical Bayesian method for sparse deep learning

W Deng, X Zhang, F Liang… - Advances in neural …, 2019 - proceedings.neurips.cc
We propose a novel adaptive empirical Bayesian (AEB) method for sparse deep learning,
where the sparsity is ensured via a class of self-adaptive spike-and-slab priors. The …
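
The snippet names spike-and-slab priors as the sparsity mechanism. The following minimal sketch evaluates the log density of a generic Gaussian spike-and-slab mixture over a weight vector; the mixture weight and variances are fixed, illustrative values, not the self-adaptive quantities learned by the AEB method.

```python
import numpy as np
from scipy.stats import norm

def spike_and_slab_logprior(w, pi=0.1, sigma_spike=0.01, sigma_slab=1.0):
    """Log density of a two-component Gaussian mixture prior on each weight.

    'Spike' = narrow Gaussian around zero (pushes weights toward pruning),
    'slab'  = wide Gaussian (keeps important weights alive).
    Hyperparameters are fixed here for illustration; the AEB method adapts
    them from the data.
    """
    spike = (1.0 - pi) * norm.pdf(w, scale=sigma_spike)
    slab = pi * norm.pdf(w, scale=sigma_slab)
    return np.sum(np.log(spike + slab))
```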

An adaptively weighted stochastic gradient MCMC algorithm for Monte Carlo simulation and global optimization

W Deng, G Lin, F Liang - Statistics and Computing, 2022 - Springer
We propose an adaptively weighted stochastic gradient Langevin dynamics (AWSGLD)
algorithm for Bayesian learning of big data problems. The proposed algorithm is scalable …
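
For context on the base dynamics that AWSGLD builds on, here is a minimal plain SGLD update; the paper's adaptive weighting rescales the gradient term using a running estimate of the energy landscape, which is omitted in this sketch. The function and step-size names are placeholders.

```python
import numpy as np

def sgld_update(theta, minibatch_grad_log_post, step_size, rng):
    """Vanilla stochastic gradient Langevin dynamics update.

    theta_{k+1} = theta_k + (eps/2) * grad_log_posterior(theta_k) + N(0, eps).
    AWSGLD multiplies the gradient by an adaptively learned weight; this
    sketch shows only the unweighted base dynamics.
    """
    eps = step_size
    noise = rng.normal(scale=np.sqrt(eps), size=theta.shape)
    return theta + 0.5 * eps * minibatch_grad_log_post(theta) + noise
```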

A contour stochastic gradient Langevin dynamics algorithm for simulations of multi-modal distributions

W Deng, G Lin, F Liang - Advances in neural information …, 2020 - proceedings.neurips.cc
We propose an adaptively weighted stochastic gradient Langevin dynamics (SGLD) algorithm,
called contour stochastic gradient Langevin dynamics (CSGLD), for Bayesian …
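
The sketch below is a heavily simplified schematic of the energy-binning bookkeeping behind the contour idea: the energy range is split into bins, a weight vector over bins is learned by stochastic approximation, and the drift is rescaled by the local slope of those weights. Bin edges, rates, and the exact multiplier and weight update all differ from the paper; this is not CSGLD itself, only an illustration of the machinery involved.

```python
import numpy as np

def contour_reweighting_sketch(grad_U, U, x0, n_bins=20, u_min=0.0, u_max=100.0,
                               lr=1e-3, sa_rate=0.01, zeta=1.0, n_iters=1000, seed=0):
    """Schematic energy-bin reweighting for a Langevin sampler (not the paper's update)."""
    rng = np.random.default_rng(seed)
    x = x0.copy()
    log_w = np.zeros(n_bins)                      # log bin weights, flat start
    edges = np.linspace(u_min, u_max, n_bins + 1)
    du = edges[1] - edges[0]
    for _ in range(n_iters):
        u = U(x)
        j = int(np.clip(np.searchsorted(edges, u) - 1, 1, n_bins - 1))
        # Drift multiplier taken from the slope of the learned log-weights.
        mult = 1.0 + zeta * (log_w[j] - log_w[j - 1]) / du
        x = x - lr * mult * grad_U(x) + np.sqrt(2.0 * lr) * rng.normal(size=x.shape)
        # Stochastic-approximation update: raise the weight of the visited bin.
        log_w[j] += sa_rate
        log_w -= log_w.mean()                     # keep weights centered
    return x, log_w
```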

On convergence of federated averaging Langevin dynamics

W Deng, Q Zhang, YA Ma, Z Song, G Lin - arXiv preprint arXiv:2112.05120, 2021 - arxiv.org
We propose a federated averaging Langevin algorithm (FA-LD) for uncertainty quantification
and mean predictions with distributed clients. In particular, we generalize beyond normal …
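
A minimal sketch of the federated-averaging-with-Langevin-noise pattern described here: each client runs a few local noisy gradient steps on its own data, then the server averages the client parameters. Client sampling, noise scaling across clients, and the privacy/convergence considerations analyzed in the paper are all omitted; the function names are placeholders.

```python
import numpy as np

def federated_langevin_sketch(grad_U_clients, theta0, n_rounds=50,
                              local_steps=5, lr=1e-3, seed=0):
    """Server averages client parameters after local Langevin updates."""
    rng = np.random.default_rng(seed)
    n_clients = len(grad_U_clients)
    clients = [theta0.copy() for _ in range(n_clients)]
    for _ in range(n_rounds):
        for c, grad_U in enumerate(grad_U_clients):
            x = clients[c]
            for _ in range(local_steps):
                # Local Langevin step on the client's own objective.
                x = x - lr * grad_U(x) + np.sqrt(2.0 * lr) * rng.normal(size=x.shape)
            clients[c] = x
        server = np.mean(clients, axis=0)          # federated averaging step
        clients = [server.copy() for _ in range(n_clients)]
    return server
```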

Nonconvex sampling with the Metropolis-adjusted Langevin algorithm

O Mangoubi, NK Vishnoi - Conference on learning theory, 2019 - proceedings.mlr.press
The Langevin Markov chain algorithms are widely deployed methods to sample
from distributions in challenging high-dimensional and non-convex statistics and machine …
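
For reference, a single Metropolis-adjusted Langevin (MALA) step looks as follows: a Langevin-drift Gaussian proposal followed by a Metropolis correction that accounts for the asymmetric proposal densities. Target and gradient functions are caller-supplied placeholders.

```python
import numpy as np

def mala_step(x, log_pi, grad_log_pi, step, rng):
    """One MALA step targeting the density proportional to exp(log_pi)."""
    def log_q(xp, xc):
        # Log density (up to constants) of proposing xp from xc.
        mean = xc + step * grad_log_pi(xc)
        return -np.sum((xp - mean) ** 2) / (4.0 * step)

    prop = x + step * grad_log_pi(x) + np.sqrt(2.0 * step) * rng.normal(size=x.shape)
    log_alpha = (log_pi(prop) + log_q(x, prop)) - (log_pi(x) + log_q(prop, x))
    if np.log(rng.uniform()) < log_alpha:
        return prop, True     # accepted
    return x, False           # rejected, stay put
```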

Does Hamiltonian Monte Carlo mix faster than a random walk on multimodal densities?

O Mangoubi, NS Pillai, A Smith - arXiv preprint arXiv:1808.03230, 2018 - arxiv.org
Hamiltonian Monte Carlo (HMC) is a very popular and generic collection of Markov chain
Monte Carlo (MCMC) algorithms. One explanation for the popularity of HMC algorithms is …
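
As a reminder of the algorithm being compared against random walks here, one standard HMC step resamples a Gaussian momentum, simulates Hamiltonian dynamics with a leapfrog integrator, and applies a Metropolis accept/reject on the resulting energy change. Step size and trajectory length are user-chosen placeholders.

```python
import numpy as np

def hmc_step(x, log_pi, grad_log_pi, step, n_leapfrog, rng):
    """One Hamiltonian Monte Carlo step with a leapfrog integrator."""
    p = rng.normal(size=x.shape)                       # resample momentum
    x_new, p_new = x.copy(), p.copy()
    p_new = p_new + 0.5 * step * grad_log_pi(x_new)    # half step for momentum
    for i in range(n_leapfrog):
        x_new = x_new + step * p_new                   # full step for position
        if i < n_leapfrog - 1:
            p_new = p_new + step * grad_log_pi(x_new)  # full step for momentum
    p_new = p_new + 0.5 * step * grad_log_pi(x_new)    # final half step
    # Metropolis test on the Hamiltonian (potential = -log_pi, kinetic = |p|^2 / 2).
    h_old = -log_pi(x) + 0.5 * np.dot(p, p)
    h_new = -log_pi(x_new) + 0.5 * np.dot(p_new, p_new)
    if np.log(rng.uniform()) < h_old - h_new:
        return x_new
    return x
```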

Replica exchange for non-convex optimization

J Dong, XT Tong - Journal of Machine Learning Research, 2021 - jmlr.org
Gradient descent (GD) is known to converge quickly for convex objective functions, but it can
be trapped at local minima. On the other hand, Langevin dynamics (LD) can explore the …
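
A simplified sketch of the exploit/explore pairing this entry describes: a gradient descent chain and a Langevin chain run in parallel, and when the noisy explorer finds a lower objective value the two chains swap states, so GD restarts its fast local convergence from the better basin. The plain comparison used as the swap test below stands in for the tempered, corrected rule analyzed in the paper.

```python
import numpy as np

def replica_exchange_gd_ld(f, grad_f, x_gd, x_ld, lr=1e-2, temp=1.0,
                           n_iters=1000, seed=0):
    """GD chain (exploit) plus Langevin chain (explore) with state swaps."""
    rng = np.random.default_rng(seed)
    for _ in range(n_iters):
        x_gd = x_gd - lr * grad_f(x_gd)                # deterministic descent
        noise = np.sqrt(2.0 * lr * temp) * rng.normal(size=x_ld.shape)
        x_ld = x_ld - lr * grad_f(x_ld) + noise        # noisy exploration
        if f(x_ld) < f(x_gd):                          # swap when the explorer is better
            x_gd, x_ld = x_ld, x_gd
    return x_gd
```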

Polygonal Unadjusted Langevin Algorithms: Creating stable and efficient adaptive algorithms for neural networks

DY Lim, S Sabanis - Journal of Machine Learning Research, 2024 - jmlr.org
We present a new class of Langevin-based algorithms, which overcomes many of the known
shortcomings of popular adaptive optimizers that are currently used for the fine tuning of …
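
As a point of reference for this line of work, the sketch below shows a generic "tamed" unadjusted Langevin step, in which the drift is rescaled so that superlinearly growing gradients cannot blow up the iterates. It is a simplified relative of the taming functions used in these algorithms, not the paper's exact polygonal scheme, and the parameter names are placeholders.

```python
import numpy as np

def tamed_langevin_step(theta, grad_f, lam, beta, rng):
    """One tamed unadjusted Langevin step.

    The gradient is divided by (1 + lam * ||grad||), which bounds the drift
    per step and keeps the scheme stable when gradients grow fast.
    """
    g = grad_f(theta)
    tamed_drift = g / (1.0 + lam * np.linalg.norm(g))
    noise = rng.normal(size=theta.shape)
    return theta - lam * tamed_drift + np.sqrt(2.0 * lam / beta) * noise
```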

Approximate optimization of convex functions with outlier noise

A De, S Khanna, H Li… - Advances in neural …, 2021 - proceedings.neurips.cc
We study the problem of minimizing a convex function given by a zeroth order oracle that is
possibly corrupted by outlier noise. Specifically, we assume the function values at …
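
The setting is minimization from function values only, some of which may be corrupted. The sketch below combines a standard two-point zeroth-order gradient estimate with a coordinate-wise median over repeated queries as a simple robustness heuristic; it illustrates the problem setup rather than the algorithm analyzed in the paper, and all names are placeholders.

```python
import numpy as np

def robust_zeroth_order_grad(f_noisy, x, delta=1e-2, n_repeats=9, seed=0):
    """Gradient estimate from (possibly outlier-corrupted) function values.

    Random two-point finite differences; the coordinate-wise median over
    repeats damps the effect of a few corrupted evaluations.
    """
    rng = np.random.default_rng(seed)
    d = x.shape[0]
    estimates = np.empty((n_repeats, d))
    for r in range(n_repeats):
        u = rng.normal(size=d)
        u /= np.linalg.norm(u)                        # random unit direction
        fd = (f_noisy(x + delta * u) - f_noisy(x - delta * u)) / (2.0 * delta)
        estimates[r] = fd * d * u                     # standard two-point scaling
    return np.median(estimates, axis=0)
```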