Recent advances in stochastic gradient descent in deep learning
In the age of artificial intelligence, how best to handle huge amounts of data is a
tremendously motivating and hard problem. Among machine learning models, stochastic …
Variance-reduced methods for machine learning
Stochastic optimization lies at the heart of machine learning, and its cornerstone is
stochastic gradient descent (SGD), a method introduced over 60 years ago. The last eight …
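To make the method concrete, here is a minimal sketch of plain SGD on a toy least-squares objective f(w) = (1/2n)‖Xw − y‖². The function name, step size, and toy data are illustrative choices, not taken from any of the papers above.

```python
import numpy as np

def sgd(X, y, lr=0.05, epochs=200, seed=0):
    """Plain SGD on the least-squares loss f(w) = (1/2n)||Xw - y||^2."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        for i in rng.permutation(n):
            # Stochastic gradient from a single sample i.
            g = (X[i] @ w - y[i]) * X[i]
            w -= lr * g
    return w

# Toy problem with a known solution (noiseless, so SGD can recover it).
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true
w_hat = sgd(X, y)
```

On this noiseless problem every per-sample gradient vanishes at the optimum, so a fixed step size suffices; with noisy labels one would decay the step size or use the variance-reduced estimators surveyed above.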
A survey of optimization methods from a machine learning perspective
Machine learning is developing rapidly; it has made many theoretical breakthroughs and is
widely applied in various fields. Optimization, as an important part of machine learning, has …
Federated optimization: Distributed machine learning for on-device intelligence
We introduce a new and increasingly relevant setting for distributed optimization in machine
learning, where the data defining the optimization are unevenly distributed over an …
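A FedAvg-style local-SGD loop is one common instantiation of this setting: clients run a few SGD steps on their own (unevenly sized) shards and a server averages the models. This is an illustrative sketch of that pattern, not the exact algorithm of the paper above; all names and hyperparameters are assumptions.

```python
import numpy as np

def local_sgd(X_parts, y_parts, lr=0.05, rounds=100, local_steps=10, seed=0):
    """FedAvg-style local SGD sketch: each client runs a few SGD steps on
    its own data shard, then the server averages the resulting models."""
    rng = np.random.default_rng(seed)
    d = X_parts[0].shape[1]
    w = np.zeros(d)
    for _ in range(rounds):
        client_models = []
        for Xc, yc in zip(X_parts, y_parts):
            wc = w.copy()
            for _ in range(local_steps):
                i = rng.integers(len(yc))            # sample from this client's shard
                wc -= lr * (Xc[i] @ wc - yc[i]) * Xc[i]
            client_models.append(wc)
        w = np.mean(client_models, axis=0)           # server averaging step
    return w

# Toy setup: one shared linear model, data split unevenly across 4 clients.
rng = np.random.default_rng(1)
X = rng.normal(size=(120, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true
splits = np.split(np.arange(120), [10, 40, 80])      # uneven shard sizes
X_parts = [X[s] for s in splits]
y_parts = [y[s] for s in splits]
w_hat = local_sgd(X_parts, y_parts)
```

Because all shards here come from the same noiseless model, every client's local optimum coincides with w_true; with heterogeneous client data the averaged iterate can drift, which is exactly the difficulty federated optimization methods address.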
An improved analysis of (variance-reduced) policy gradient and natural policy gradient methods
In this paper, we revisit and improve the convergence of policy gradient (PG), natural PG
(NPG) methods, and their variance-reduced variants, under general smooth policy …
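For readers unfamiliar with policy gradient methods, here is a basic REINFORCE-style sketch on a two-armed bandit with a softmax policy, updating θ along ∇log π(a)·r. This illustrates plain PG only, not the variance-reduced or natural-gradient variants analyzed in the paper; the setup is an assumption for illustration.

```python
import numpy as np

def reinforce_bandit(rewards=(0.2, 0.8), lr=0.5, iters=2000, seed=0):
    """REINFORCE on a 2-armed bandit with a softmax policy:
    theta <- theta + lr * r * grad_theta log pi(a | theta)."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(2)
    for _ in range(iters):
        p = np.exp(theta - theta.max())
        p /= p.sum()                       # softmax policy pi(. | theta)
        a = rng.choice(2, p=p)             # sample an action
        r = rewards[a]                     # deterministic reward for this sketch
        grad_log = -p                      # grad of log softmax: e_a - pi
        grad_log[a] += 1.0
        theta += lr * r * grad_log         # score-function (REINFORCE) update
    p = np.exp(theta - theta.max())
    return p / p.sum()

probs = reinforce_bandit()
```

The policy concentrates on the higher-reward arm. The high variance of this score-function estimator is precisely what motivates the variance-reduced PG/NPG methods studied above.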
Momentum and stochastic momentum for stochastic gradient, Newton, proximal point and subspace descent methods
N Loizou, P Richtárik - Computational Optimization and Applications, 2020 - Springer
In this paper we study several classes of stochastic optimization algorithms enriched with
heavy ball momentum. Among the methods studied are: stochastic gradient descent …
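The heavy-ball variant of SGD referenced here adds a momentum buffer to the stochastic gradient step: v ← βv + g, w ← w − η v. A minimal sketch on the same toy least-squares objective, with illustrative (assumed) hyperparameters:

```python
import numpy as np

def sgd_heavy_ball(X, y, lr=0.01, beta=0.9, epochs=100, seed=0):
    """SGD with heavy-ball momentum on f(w) = (1/2n)||Xw - y||^2:
    v <- beta * v + g;  w <- w - lr * v."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    v = np.zeros(d)
    for _ in range(epochs):
        for i in rng.permutation(n):
            g = (X[i] @ w - y[i]) * X[i]   # stochastic gradient from sample i
            v = beta * v + g               # accumulate momentum
            w = w - lr * v
    return w

# Toy problem with a known solution.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true
w_hat = sgd_heavy_ball(X, y)
```

With β = 0.9 the effective step size is roughly lr/(1 − β), which is why the base step here is smaller than for plain SGD.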
Don't jump through hoops and remove those loops: SVRG and Katyusha are better without the outer loop
The stochastic variance-reduced gradient method (SVRG) and its accelerated variant
(Katyusha) have attracted enormous attention in the machine learning community in the last …
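The loop-removal idea in the title can be sketched concretely: instead of SVRG's fixed-length outer loop, refresh the reference point (and its full gradient) by a coin flip with probability p at every step. This is a simplified L-SVRG-style sketch on a toy least-squares problem; step size and refresh probability are assumptions.

```python
import numpy as np

def loopless_svrg(X, y, lr=0.01, p=0.1, iters=10000, seed=0):
    """Loopless SVRG sketch on f(w) = (1/2n)||Xw - y||^2. SVRG's outer
    loop is replaced by a coin flip that refreshes the reference point."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    w_ref = w.copy()
    mu = X.T @ (X @ w_ref - y) / n          # full gradient at the reference point
    for _ in range(iters):
        i = rng.integers(n)
        # Variance-reduced estimator: grad f_i(w) - grad f_i(w_ref) + mu.
        # It is unbiased, and its variance shrinks as w, w_ref near the optimum.
        g = (X[i] @ w - y[i]) * X[i] - (X[i] @ w_ref - y[i]) * X[i] + mu
        w = w - lr * g
        if rng.random() < p:                # coin flip replaces the outer loop
            w_ref = w.copy()
            mu = X.T @ (X @ w_ref - y) / n
    return w

# Toy problem with a known solution.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true
w_hat = loopless_svrg(X, y)
```

In expectation the reference point is refreshed every 1/p iterations, mirroring SVRG's epoch length without the nested-loop bookkeeping.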
Distributed optimization with arbitrary local solvers
With the growth of data and necessity for distributed optimization methods, solvers that work
well on a single machine must be re-designed to leverage distributed computation. Recent …
Linear convergence of natural policy gradient methods with log-linear policies
We consider infinite-horizon discounted Markov decision processes and study the
convergence rates of the natural policy gradient (NPG) and the Q-NPG methods with the log …
Stochastic quasi-Newton methods for nonconvex stochastic optimization
In this paper we study stochastic quasi-Newton methods for nonconvex stochastic
optimization, where we assume that noisy information about the gradients of the objective …