Biased stochastic conjugate gradient algorithm with adaptive step size for nonconvex problems

R Huang, Y Qin, K Liu, G Yuan - Expert Systems with Applications, 2024 - Elsevier
Conjugate gradient (CG) algorithms are widely applied to machine learning problems owing
to their low computational cost compared with second-order methods and better convergence …
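
The snippet gives no update rule, so the following is only a generic stochastic CG sketch, not the authors' biased CG algorithm: the Polak-Ribiere coefficient with a restart clip and the Barzilai-Borwein-style adaptive step size are illustrative assumptions, shown on a least-squares toy problem.

```python
# Generic stochastic conjugate-gradient sketch (NOT the paper's method).
import numpy as np

rng = np.random.default_rng(0)
A, b = rng.normal(size=(200, 10)), rng.normal(size=200)

def stoch_grad(x, batch=32):
    idx = rng.integers(0, A.shape[0], size=batch)
    Ai, bi = A[idx], b[idx]
    return Ai.T @ (Ai @ x - bi) / batch

x = np.zeros(10)
g = stoch_grad(x)
p = -g                       # initial direction: steepest descent
eta = 0.01
for t in range(200):
    x_new = x + eta * p
    g_new = stoch_grad(x_new)
    # Polak-Ribiere conjugacy coefficient, clipped at 0 (restart) -- assumption
    beta = max(0.0, g_new @ (g_new - g) / (g @ g + 1e-12))
    p = -g_new + beta * p
    # Barzilai-Borwein-style adaptive step size -- assumption
    s, y = x_new - x, g_new - g
    eta = abs(s @ y) / (y @ y + 1e-12)
    x, g = x_new, g_new
print("final objective:", 0.5 * np.mean((A @ x - b) ** 2))
```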

Spider: Near-optimal non-convex optimization via stochastic path-integrated differential estimator

C Fang, CJ Li, Z Lin, T Zhang - Advances in neural …, 2018 - proceedings.neurips.cc
In this paper, we propose a new technique named Stochastic Path-Integrated
Differential EstimatoR (SPIDER), which can be used to track many deterministic quantities of …
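
A minimal sketch of the SPIDER gradient estimator on a finite-sum problem: a periodic full-gradient refresh anchors a path-integrated recursion that evaluates the same minibatch at consecutive iterates. The recursion matches the paper's estimator; the normalized step size and the constants q, L, and epsilon below are illustrative choices, not tuned values.

```python
import numpy as np

rng = np.random.default_rng(1)
A, b = rng.normal(size=(500, 20)), rng.normal(size=500)
n = A.shape[0]

def grad(x, idx):                      # minibatch gradient over rows idx
    Ai, bi = A[idx], b[idx]
    return Ai.T @ (Ai @ x - bi) / len(idx)

x, x_prev = np.zeros(20), np.zeros(20)
q, batch, L, eps = 50, 16, 10.0, 1e-3
for k in range(500):
    if k % q == 0:
        v = grad(x, np.arange(n))      # periodic full-gradient refresh
    else:
        S = rng.integers(0, n, size=batch)
        # path-integrated recursion: same batch S at x and x_prev
        v = grad(x, S) - grad(x_prev, S) + v
    x_prev = x
    # normalized step, capped at 1/L -- illustrative simplification
    x = x - min(1.0 / L, eps / (L * np.linalg.norm(v) + 1e-12)) * v
print("grad norm:", np.linalg.norm(grad(x, np.arange(n))))
```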

Momentum-based variance reduction in non-convex sgd

A Cutkosky, F Orabona - Advances in neural information …, 2019 - proceedings.neurips.cc
Variance reduction has emerged in recent years as a strong competitor to stochastic
gradient descent in non-convex problems, providing the first algorithms to improve upon the …
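
A hedged sketch of the momentum-based variance-reduction (STORM-style) estimator: each step blends a fresh stochastic gradient with a correction term evaluated on the same sample at the previous iterate, so no periodic full-gradient pass is needed. The fixed momentum weight a and step size eta are simplifications; the paper adapts both from observed gradient norms.

```python
import numpy as np

rng = np.random.default_rng(2)
A, b = rng.normal(size=(500, 20)), rng.normal(size=500)

def grad(x, idx):
    Ai, bi = A[idx], b[idx]
    return Ai.T @ (Ai @ x - bi) / len(idx)

x = np.zeros(20)
a, eta, batch = 0.1, 0.05, 16
idx = rng.integers(0, 500, size=batch)
d = grad(x, idx)                       # d_0: plain stochastic gradient
for t in range(500):
    x_new = x - eta * d
    idx = rng.integers(0, 500, size=batch)   # one fresh batch per step
    # key step: both terms of the correction use the SAME fresh batch
    d = grad(x_new, idx) + (1 - a) * (d - grad(x, idx))
    x = x_new
print("grad norm:", np.linalg.norm(grad(x, np.arange(500))))
```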

SARAH: A novel method for machine learning problems using stochastic recursive gradient

LM Nguyen, J Liu, K Scheinberg… - … conference on machine …, 2017 - proceedings.mlr.press
In this paper, we propose a StochAstic Recursive grAdient algoritHm (SARAH), as well as its
practical variant SARAH+, as a novel approach to finite-sum minimization problems …
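
A minimal sketch of one SARAH outer iteration: unlike SVRG, the inner recursion updates the estimate v from its own previous value rather than from a stored full-gradient anchor. The problem setup and the hyperparameters eta and m are illustrative (SARAH+ additionally stops the inner loop early once the estimate's norm is small).

```python
import numpy as np

rng = np.random.default_rng(3)
A, b = rng.normal(size=(500, 20)), rng.normal(size=500)
n = A.shape[0]

def grad_i(x, i):                      # gradient of the i-th component
    return A[i] * (A[i] @ x - b[i])

w = np.zeros(20)
eta, m = 0.01, 100
for epoch in range(5):
    v = A.T @ (A @ w - b) / n          # v_0: full gradient
    w_prev = w
    w = w - eta * v
    for t in range(m):                 # inner loop: recursive estimator
        i = rng.integers(0, n)
        v = grad_i(w, i) - grad_i(w_prev, i) + v
        w_prev = w
        w = w - eta * v
print("grad norm:", np.linalg.norm(A.T @ (A @ w - b) / n))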

Breaking the centralized barrier for cross-device federated learning

SP Karimireddy, M Jaggi, S Kale… - Advances in …, 2021 - proceedings.neurips.cc
Federated learning (FL) is a challenging setting for optimization due to the heterogeneity of
the data across different clients, which gives rise to the client-drift phenomenon. In fact …
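
A hedged illustration of the client-drift phenomenon the snippet mentions, not the paper's proposed (Mime-style) algorithm: with heterogeneous data, many local SGD steps pull each client toward its own minimizer, so the averaged FedAvg update is biased away from the global optimum. The toy quadratic objectives below are assumptions chosen to make the bias visible.

```python
import numpy as np

# two clients with different curvatures and minimizers (heterogeneous data)
H = [np.diag([1.0, 10.0]), np.diag([10.0, 1.0])]
client_opt = [np.array([5.0, 0.0]), np.array([-5.0, 4.0])]
# minimizer of the AVERAGED objective differs from the mean of client optima
global_opt = np.linalg.solve(H[0] + H[1],
                             H[0] @ client_opt[0] + H[1] @ client_opt[1])

def local_grad(x, c):                  # quadratic client objective
    return H[c] @ (x - client_opt[c])

x_server = np.zeros(2)
eta, local_steps = 0.05, 50
for rnd in range(10):
    updates = []
    for c in range(2):
        x = x_server.copy()
        for _ in range(local_steps):   # many local steps => drift
            x -= eta * local_grad(x, c)
        updates.append(x)
    x_server = sum(updates) / 2        # FedAvg: average client iterates
print("server iterate:", x_server, " global opt:", global_opt)
```

Running this shows the server iterate settling near the mean of the client optima rather than near global_opt, which is the drift the paper sets out to correct.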

Stochastic nested variance reduction for nonconvex optimization

D Zhou, P Xu, Q Gu - Journal of machine learning research, 2020 - jmlr.org
We study nonconvex optimization problems, where the objective function is either an
average of n nonconvex functions or the expectation of some stochastic function. We …
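
A heavily simplified two-level sketch of the nested idea: a large batch anchors the estimator at a slowly refreshed reference point, a mid-size batch corrects at an intermediate frequency, and a small batch corrects every step. The paper nests K such levels with geometrically scaled batch sizes and refresh periods; all constants here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)
A, b = rng.normal(size=(1000, 20)), rng.normal(size=1000)
n = A.shape[0]

def grad(x, idx):
    Ai, bi = A[idx], b[idx]
    return Ai.T @ (Ai @ x - bi) / len(idx)

x, eta = np.zeros(20), 0.02
T1, T2, B1, B2 = 100, 10, 200, 10      # refresh periods and batch sizes
for k in range(500):
    if k % T1 == 0:                    # level 0: large-batch anchor
        x0, g0 = x.copy(), grad(x, rng.integers(0, n, size=B1 * 4))
    if k % T2 == 0:                    # level 1: mid-batch correction
        S1 = rng.integers(0, n, size=B1)
        x1, g1 = x.copy(), g0 + grad(x, S1) - grad(x0, S1)
    S2 = rng.integers(0, n, size=B2)   # level 2: per-step small batch
    v = g1 + grad(x, S2) - grad(x1, S2)
    x = x - eta * v
print("grad norm:", np.linalg.norm(grad(x, np.arange(n))))
```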

Spiderboost and momentum: Faster variance reduction algorithms

Z Wang, K Ji, Y Zhou, Y Liang… - Advances in Neural …, 2019 - proceedings.neurips.cc
SARAH and SPIDER are two recently developed stochastic variance-reduced algorithms,
and SPIDER has been shown to achieve a near-optimal first-order oracle complexity in …
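
A sketch of the SpiderBoost modification: keep SPIDER's recursive estimator but take a constant step eta = 1/(2L) instead of SPIDER's normalized, accuracy-dependent step. The problem instance and the constants q and batch are illustrative.

```python
import numpy as np

rng = np.random.default_rng(6)
A, b = rng.normal(size=(500, 20)), rng.normal(size=500)
n = A.shape[0]
L = np.linalg.norm(A, 2) ** 2 / n      # smoothness constant of the loss

def grad(x, idx):
    Ai, bi = A[idx], b[idx]
    return Ai.T @ (Ai @ x - bi) / len(idx)

x, x_prev = np.zeros(20), np.zeros(20)
q, batch = 50, 16
for k in range(500):
    if k % q == 0:
        v = grad(x, np.arange(n))      # periodic full-gradient refresh
    else:
        S = rng.integers(0, n, size=batch)
        v = grad(x, S) - grad(x_prev, S) + v
    x_prev = x
    x = x - v / (2 * L)                # constant step size 1/(2L)
print("grad norm:", np.linalg.norm(grad(x, np.arange(n))))
```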

ProxSARAH: An efficient algorithmic framework for stochastic composite nonconvex optimization

NH Pham, LM Nguyen, DT Phan… - Journal of Machine …, 2020 - jmlr.org
We propose a new stochastic first-order algorithmic framework to solve stochastic composite
nonconvex optimization problems that covers both finite-sum and expectation settings. Our …
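
A hedged sketch of a proximal step on top of the SARAH estimator for a composite objective f(x) + lam*||x||_1. ProxSARAH's actual scheme uses an additional averaging step and paired step sizes; this minimal version only illustrates estimator-plus-prox, and lam, eta, and m are illustrative.

```python
import numpy as np

rng = np.random.default_rng(7)
A, b = rng.normal(size=(500, 20)), rng.normal(size=500)
n, lam, eta, m = 500, 0.1, 0.01, 100

def grad_i(x, i):                      # gradient of the i-th smooth term
    return A[i] * (A[i] @ x - b[i])

def prox_l1(z, t):                     # soft-thresholding: prox of t*||.||_1
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

w = np.zeros(20)
for epoch in range(5):
    v = A.T @ (A @ w - b) / n          # full gradient at outer start
    w_prev, w = w, prox_l1(w - eta * v, eta * lam)
    for t in range(m):                 # SARAH recursion + proximal step
        i = rng.integers(0, n)
        v = grad_i(w, i) - grad_i(w_prev, i) + v
        w_prev, w = w, prox_l1(w - eta * v, eta * lam)
print("nonzeros:", np.count_nonzero(np.abs(w) > 1e-8))
```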

Recent theoretical advances in non-convex optimization

M Danilova, P Dvurechensky, A Gasnikov… - … and Probability: With a …, 2022 - Springer
Motivated by the recent surge of interest in optimization algorithms for non-convex
problems arising in the training of deep neural networks and other optimization problems …

A multi-batch L-BFGS method for machine learning

AS Berahas, J Nocedal… - Advances in Neural …, 2016 - proceedings.neurips.cc
The question of how to parallelize the stochastic gradient descent (SGD) method has
received much attention in the literature. In this paper, we focus instead on batch methods …
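
A hedged sketch of the multi-batch idea: when the sampled batch changes every iteration, the curvature pairs (s, y) are computed on the overlap of consecutive batches, so both gradients in y see the same data. The two-loop recursion is the standard L-BFGS one; the batch handling, step size, and curvature-skipping rule are illustrative simplifications.

```python
import numpy as np
from collections import deque

rng = np.random.default_rng(8)
A, b = rng.normal(size=(1000, 20)), rng.normal(size=1000)

def grad(x, idx):
    Ai, bi = A[idx], b[idx]
    return Ai.T @ (Ai @ x - bi) / len(idx)

def two_loop(g, pairs):                # standard L-BFGS two-loop recursion
    q, alphas = g.copy(), []
    for s, y in reversed(pairs):
        a = (s @ q) / (s @ y)
        alphas.append(a)
        q -= a * y
    if pairs:
        s, y = pairs[-1]
        q *= (s @ y) / (y @ y)         # initial Hessian scaling
    for (s, y), a in zip(pairs, reversed(alphas)):
        q += (a - (y @ q) / (s @ y)) * s
    return q

x, eta, pairs = np.zeros(20), 0.5, deque(maxlen=10)
batch = rng.choice(1000, size=200, replace=False)
for k in range(100):
    new_batch = rng.choice(1000, size=200, replace=False)
    overlap = np.intersect1d(batch, new_batch)   # shared samples O_k
    g = grad(x, batch)
    d = -two_loop(g, list(pairs))
    x_new = x + eta * d
    s = x_new - x
    y = grad(x_new, overlap) - grad(x, overlap)  # consistent curvature pair
    if s @ y > 1e-10:                  # skip pair unless curvature positive
        pairs.append((s, y))
    x, batch = x_new, new_batch
print("grad norm:", np.linalg.norm(grad(x, np.arange(1000))))
```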