Recent advances in stochastic gradient descent in deep learning

Y Tian, Y Zhang, H Zhang - Mathematics, 2023 - mdpi.com
In the age of artificial intelligence, finding the best approach to handling huge amounts of data is a
tremendously motivating and hard problem. Among machine learning models, stochastic …
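
For reference alongside the entries below, here is a minimal sketch of plain stochastic gradient descent on a synthetic least-squares problem; the data, step size, and iteration count are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

# Minimal SGD sketch on an assumed synthetic least-squares problem
rng = np.random.default_rng(0)
A = rng.normal(size=(500, 20))
x_true = rng.normal(size=20)
b = A @ x_true + 0.1 * rng.normal(size=500)

x = np.zeros(20)
lr = 0.01                                 # assumed constant step size
for k in range(5000):
    i = rng.integers(len(b))              # pick one example uniformly at random
    g = (A[i] @ x - b[i]) * A[i]          # unbiased estimate of the full gradient
    x -= lr * g                           # SGD step
```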

Variance-reduced methods for machine learning

RM Gower, M Schmidt, F Bach… - Proceedings of the …, 2020 - ieeexplore.ieee.org
Stochastic optimization lies at the heart of machine learning, and its cornerstone is
stochastic gradient descent (SGD), a method introduced over 60 years ago. The last eight …
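
To make the variance-reduction idea concrete, here is a rough SVRG-style sketch for a finite-sum least-squares objective f(x) = (1/n) * sum_i 0.5*(a_i^T x - b_i)^2; the data, step size, and inner-loop length are assumptions for illustration, not the survey's own pseudocode.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 500, 20
A = rng.normal(size=(n, d))
b = A @ rng.normal(size=d) + 0.1 * rng.normal(size=n)

def grad_i(x, i):
    return (A[i] @ x - b[i]) * A[i]       # gradient of one summand

def full_grad(x):
    return A.T @ (A @ x - b) / n          # gradient of the full finite sum

x = np.zeros(d)
lr, m = 0.01, n                           # assumed step size; inner-loop length m = n is a common choice
for epoch in range(30):
    w = x.copy()                          # reference point
    mu = full_grad(w)                     # full gradient at the reference point
    for _ in range(m):
        i = rng.integers(n)
        g = grad_i(x, i) - grad_i(w, i) + mu   # variance-reduced (control-variate) estimate
        x -= lr * g
```

The key point is the control variate: grad_i(x, i) - grad_i(w, i) + mu is still an unbiased gradient estimate, but its variance shrinks as x approaches the reference point w.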

A survey of optimization methods from a machine learning perspective

S Sun, Z Cao, H Zhu, J Zhao - IEEE Transactions on Cybernetics, 2019 - ieeexplore.ieee.org
Machine learning is developing rapidly; it has produced many theoretical breakthroughs and is
widely applied in various fields. Optimization, as an important part of machine learning, has …

Federated optimization: Distributed machine learning for on-device intelligence

J Konečný, HB McMahan, D Ramage… - arXiv preprint arXiv …, 2016 - arxiv.org
We introduce a new and increasingly relevant setting for distributed optimization in machine
learning, where the data defining the optimization are unevenly distributed over an …
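
As a generic illustration of this setting (not the specific algorithms proposed in the paper), the sketch below runs a few local SGD steps on each client's unevenly sized shard and then averages the models on the server, weighting by local dataset size; the data, step size, and round counts are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_clients = 20, 5
x_true = rng.normal(size=d)

# Unevenly sized client datasets, as in the federated setting
clients = []
for c in range(n_clients):
    n_c = rng.integers(20, 200)
    A_c = rng.normal(size=(n_c, d))
    b_c = A_c @ x_true + 0.1 * rng.normal(size=n_c)
    clients.append((A_c, b_c))

x_global = np.zeros(d)
lr, local_steps = 0.01, 10                        # assumed hyperparameters

for rnd in range(100):
    updates, weights = [], []
    for A_c, b_c in clients:
        x = x_global.copy()
        for _ in range(local_steps):              # local SGD on the client's own shard
            i = rng.integers(len(b_c))
            x -= lr * (A_c[i] @ x - b_c[i]) * A_c[i]
        updates.append(x)
        weights.append(len(b_c))
    # Server aggregation: weighted average by local dataset size
    x_global = np.average(updates, axis=0, weights=weights)
```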

An improved analysis of (variance-reduced) policy gradient and natural policy gradient methods

Y Liu, K Zhang, T Basar, W Yin - Advances in Neural …, 2020 - proceedings.neurips.cc
In this paper, we revisit and improve the convergence of policy gradient (PG), natural PG
(NPG) methods, and their variance-reduced variants, under general smooth policy …
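
For orientation, here is a vanilla REINFORCE-style policy gradient sketch on a toy bandit with a softmax policy; the variance-reduced PG/NPG variants analyzed in the paper are more elaborate, and the rewards, baseline, batch size, and step size below are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_actions = 4
true_means = np.array([0.2, 0.5, 0.9, 0.1])    # assumed arm rewards
theta = np.zeros(n_actions)                    # softmax policy parameters

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

lr, batch = 0.1, 32
baseline = 0.0
for step in range(300):
    pi = softmax(theta)
    grad = np.zeros(n_actions)
    for _ in range(batch):
        a = rng.choice(n_actions, p=pi)
        r = true_means[a] + 0.1 * rng.normal()
        # REINFORCE estimator: (r - baseline) * grad log pi(a),
        # with grad log pi(a) = e_a - pi for a softmax policy
        grad += (r - baseline) * (np.eye(n_actions)[a] - pi)
        baseline = 0.99 * baseline + 0.01 * r   # running-average baseline to reduce variance
    theta += lr * grad / batch                  # gradient ascent on the expected reward

print(softmax(theta))                           # probability mass concentrates on the best arm
```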

Momentum and stochastic momentum for stochastic gradient, Newton, proximal point and subspace descent methods

N Loizou, P Richtárik - Computational Optimization and Applications, 2020 - Springer
In this paper we study several classes of stochastic optimization algorithms enriched with
heavy ball momentum. Among the methods studied are: stochastic gradient descent …
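
A minimal sketch of the stochastic heavy-ball update on a synthetic least-squares problem, assuming an illustrative step size and momentum parameter:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(200, 10))
b = A @ rng.normal(size=10) + 0.01 * rng.normal(size=200)

x = np.zeros(10)
x_prev = x.copy()
lr, beta = 0.01, 0.9                              # assumed step size and momentum parameter

for k in range(5000):
    i = rng.integers(len(b))                      # sample one data point uniformly
    g = (A[i] @ x - b[i]) * A[i]                  # stochastic gradient of 0.5*(a_i^T x - b_i)^2
    x_next = x - lr * g + beta * (x - x_prev)     # heavy-ball step: SGD plus beta*(x_k - x_{k-1})
    x_prev, x = x, x_next
```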

Don't jump through hoops and remove those loops: SVRG and Katyusha are better without the outer loop

D Kovalev, S Horváth… - Algorithmic Learning …, 2020 - proceedings.mlr.press
The stochastic variance-reduced gradient method (SVRG) and its accelerated variant
(Katyusha) have attracted enormous attention in the machine learning community in the last …
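
To illustrate the "loopless" idea, the sketch below replaces SVRG's inner loop with a coin flip that refreshes the reference point with small probability p; the data, step size, and the choice p ≈ 1/n are assumptions in the spirit of the paper rather than its exact pseudocode.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 500, 20
A = rng.normal(size=(n, d))
b = A @ rng.normal(size=d) + 0.1 * rng.normal(size=n)

def grad_i(x, i):
    return (A[i] @ x - b[i]) * A[i]

def full_grad(x):
    return A.T @ (A @ x - b) / n

x = np.zeros(d)
w = x.copy()                              # reference point
mu = full_grad(w)                         # full gradient at the reference point
lr, p = 0.01, 1.0 / n                     # assumed step size; refresh probability p ~ 1/n

for k in range(20000):
    i = rng.integers(n)
    g = grad_i(x, i) - grad_i(w, i) + mu  # same control-variate estimator as SVRG
    x -= lr * g
    if rng.random() < p:                  # coin flip replaces the outer loop
        w = x.copy()
        mu = full_grad(w)
```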

Distributed optimization with arbitrary local solvers

C Ma, J Konečný, M Jaggi, V Smith… - Optimization Methods …, 2017 - Taylor & Francis
With the growth of data and necessity for distributed optimization methods, solvers that work
well on a single machine must be re-designed to leverage distributed computation. Recent …

Linear convergence of natural policy gradient methods with log-linear policies

R Yuan, SS Du, RM Gower, A Lazaric… - … Conference on Learning …, 2023 - par.nsf.gov
We consider infinite-horizon discounted Markov decision processes and study the
convergence rates of the natural policy gradient (NPG) and the Q-NPG methods with the log …
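
As a toy illustration of a natural policy gradient step with a softmax (log-linear) policy, the sketch below preconditions the policy gradient with a damped Fisher information matrix on a bandit problem; the rewards, step size, and damping are assumptions, and this is not the Q-NPG scheme analyzed in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n_actions = 4
r = np.array([1.0, 0.5, 0.2, 0.0])       # assumed mean rewards
theta = np.zeros(n_actions)              # log-linear (softmax) policy parameters

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

eta, damping = 0.2, 1e-6                 # assumed step size and Fisher damping
for step in range(100):
    pi = softmax(theta)
    # Gradient of the expected reward J(theta) = sum_a pi_a r_a;
    # the softmax Jacobian is diag(pi) - pi pi^T
    grad = (np.diag(pi) - np.outer(pi, pi)) @ r
    # Fisher information F = E_a[(e_a - pi)(e_a - pi)^T] = diag(pi) - pi pi^T
    F = np.diag(pi) - np.outer(pi, pi)
    # Damped natural gradient step (F is singular along the all-ones direction)
    theta += eta * np.linalg.solve(F + damping * np.eye(n_actions), grad)

print(softmax(theta))                    # the policy concentrates on the highest-reward action
```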

Stochastic quasi-Newton methods for nonconvex stochastic optimization

X Wang, S Ma, D Goldfarb, W Liu - SIAM Journal on Optimization, 2017 - SIAM
In this paper we study stochastic quasi-Newton methods for nonconvex stochastic
optimization, where we assume that noisy information about the gradients of the objective …
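
A rough stochastic L-BFGS-style sketch in the same spirit (a generic illustration, not the specific algorithms analyzed in the paper): gradients are estimated on minibatches, curvature pairs are formed from gradient differences on the same minibatch so they stay consistent, and the search direction comes from the standard two-loop recursion; the problem data and hyperparameters are assumptions.

```python
import numpy as np
from collections import deque

rng = np.random.default_rng(0)
n, d = 1000, 20
A = rng.normal(size=(n, d))
b = A @ rng.normal(size=d) + 0.1 * rng.normal(size=n)

def batch_grad(x, idx):
    r = A[idx] @ x - b[idx]
    return A[idx].T @ r / len(idx)        # minibatch gradient estimate

def two_loop(g, pairs):
    """Approximate H^{-1} g with the standard L-BFGS two-loop recursion."""
    q = g.copy()
    alphas = []
    for s, y in reversed(pairs):          # newest pair first
        rho = 1.0 / (y @ s)
        a = rho * (s @ q)
        q -= a * y
        alphas.append((a, rho, s, y))
    if pairs:
        s, y = pairs[-1]
        q *= (s @ y) / (y @ y)            # initial Hessian scaling
    for a, rho, s, y in reversed(alphas): # oldest pair first
        beta = rho * (y @ q)
        q += (a - beta) * s
    return q

x = np.zeros(d)
pairs = deque(maxlen=10)                  # limited memory of curvature pairs (s, y)
lr, batch = 0.1, 32                       # assumed step size and minibatch size

for k in range(500):
    idx = rng.integers(n, size=batch)
    g = batch_grad(x, idx)
    direction = two_loop(g, list(pairs))  # quasi-Newton direction (plain gradient if no pairs yet)
    x_new = x - lr * direction
    # Curvature pair from the SAME minibatch; skip it if curvature is not positive
    y = batch_grad(x_new, idx) - g
    s = x_new - x
    if y @ s > 1e-10:
        pairs.append((s, y))
    x = x_new
```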