Recent advances in stochastic gradient descent in deep learning

Y Tian, Y Zhang, H Zhang - Mathematics, 2023 - mdpi.com
In the age of artificial intelligence, how best to handle huge amounts of data is a
tremendously motivating and hard problem. Among machine learning models, stochastic …
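
The method this survey covers can be illustrated with a minimal sketch (not taken from the paper; the toy objective and function names here are illustrative): plain SGD steps against the gradient of one sample at a time.

```python
import random

def sgd(grad_fn, data, w, lr=0.1, epochs=20):
    """Minimal SGD: step against the gradient of one randomly ordered
    sample at a time, for a fixed number of passes over the data."""
    for _ in range(epochs):
        random.shuffle(data)
        for x in data:
            w -= lr * grad_fn(w, x)
    return w

# Toy objective: mean of (w - x)^2; the per-sample gradient is 2 * (w - x),
# so the minimizer is the sample mean (2.5 here).
data = [1.0, 2.0, 3.0, 4.0]
w = sgd(lambda w, x: 2.0 * (w - x), data, w=0.0)
```

With a constant step size the iterate does not converge exactly but oscillates in a neighborhood of the minimizer, which is the behavior variance-reduction methods (several entries below) are designed to remove.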

Data-driven aerospace engineering: reframing the industry with machine learning

SL Brunton, J Nathan Kutz, K Manohar, AY Aravkin… - AIAA Journal, 2021 - arc.aiaa.org
Data science, and machine learning in particular, is rapidly transforming the scientific and
industrial landscapes. The aerospace industry is poised to capitalize on big data and …

Research advances in stochastic gradient descent algorithms

史加荣, 王丹, 尚凡华, 张鹤于 - Acta Automatica Sinica, 2021 - aas.net.cn
In machine learning, the gradient descent algorithm is the most important and fundamental method
for solving optimization problems. As data scales keep growing, traditional gradient descent can no
longer effectively solve large-scale machine learning problems. The stochastic gradient descent algorithm, at each iteration …

Linear convergence of gradient and proximal-gradient methods under the polyak-łojasiewicz condition

H Karimi, J Nutini, M Schmidt - … Conference, ECML PKDD 2016, Riva del …, 2016 - Springer
In 1963, Polyak proposed a simple condition that is sufficient to show a global linear
convergence rate for gradient descent. This condition is a special case of the Łojasiewicz …
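
For context, the Polyak condition the abstract refers to is usually stated as follows (a standard formulation of the Polyak–Łojasiewicz inequality, not quoted from the paper):

```latex
% PL condition: the gradient norm dominates the suboptimality gap
\frac{1}{2}\,\|\nabla f(x)\|^2 \;\ge\; \mu\,\bigl(f(x) - f^*\bigr)
\qquad \text{for all } x.
% For an L-smooth f satisfying it, gradient descent with step size 1/L gives
f(x_{k+1}) - f^* \;\le\; \Bigl(1 - \frac{\mu}{L}\Bigr)\bigl(f(x_k) - f^*\bigr),
% i.e. a global linear rate without requiring convexity.
```
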

Stochastic variance reduction for nonconvex optimization

SJ Reddi, A Hefny, S Sra, B Poczos… - … on machine learning, 2016 - proceedings.mlr.press
We study nonconvex finite-sum problems and analyze stochastic variance reduced gradient
(SVRG) methods for them. SVRG and related methods have recently surged into …

Global optimality guarantees for policy gradient methods

J Bhandari, D Russo - Operations Research, 2024 - pubsonline.informs.org
Policy gradient methods apply to complex, poorly understood control problems by
performing stochastic gradient descent over a parameterized class of policies. Unfortunately …
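
A minimal sketch of the policy-gradient recipe the abstract describes (a toy two-armed bandit with an assumed softmax parameterization, not the paper's setting): ascend the score-function gradient estimate r * d log pi(a).

```python
import math, random

def reinforce_bandit(rewards, theta=0.0, lr=0.5, steps=2000):
    """REINFORCE on a two-armed bandit: stochastic gradient ascent on
    expected reward via the score-function estimator r * dlog pi(a)."""
    for _ in range(steps):
        p = 1.0 / (1.0 + math.exp(-theta))       # prob of choosing arm 1
        a = 1 if random.random() < p else 0
        r = rewards[a]
        grad_logp = (1.0 - p) if a == 1 else -p  # d log pi(a) / d theta
        theta += lr * r * grad_logp
    return theta

# Arm 1 always pays 1, arm 0 pays 0: the policy should learn to pick arm 1.
theta = reinforce_bandit([0.0, 1.0])
```

Even in this one-parameter example the objective is nonconvex in general parameterizations, which is why global optimality guarantees of the kind the paper studies are nontrivial.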

A primer on zeroth-order optimization in signal processing and machine learning: Principals, recent advances, and applications

S Liu, PY Chen, B Kailkhura, G Zhang… - IEEE Signal …, 2020 - ieeexplore.ieee.org
Zeroth-order (ZO) optimization is a subset of gradient-free optimization that emerges in many
signal processing and machine learning (ML) applications. It is used for solving optimization …
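
The core primitive of ZO optimization can be sketched as follows (an illustrative two-point estimator with assumed names, not code from the paper): estimate a gradient from function values alone by probing along random directions.

```python
import random

def zo_gradient(f, x, mu=1e-4, samples=500):
    """Two-point zeroth-order gradient estimate: average the scaled
    finite differences of f along random Gaussian directions."""
    d = len(x)
    g = [0.0] * d
    fx = f(x)
    for _ in range(samples):
        u = [random.gauss(0.0, 1.0) for _ in range(d)]
        diff = (f([xi + mu * ui for xi, ui in zip(x, u)]) - fx) / mu
        for i in range(d):
            g[i] += diff * u[i] / samples
    return g

# f(x) = x1^2 + x2^2, so the true gradient at (1, -2) is (2, -4).
g = zo_gradient(lambda x: sum(xi * xi for xi in x), [1.0, -2.0])
```

The estimate is unbiased up to a smoothing term of order mu but noisy, so the number of probes trades off query cost against estimator variance.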

Stochastic model-based minimization of weakly convex functions

D Davis, D Drusvyatskiy - SIAM Journal on Optimization, 2019 - SIAM
We consider a family of algorithms that successively sample and minimize simple stochastic
models of the objective function. We show that under reasonable conditions on …

Stochastic nested variance reduction for nonconvex optimization

D Zhou, P Xu, Q Gu - Journal of machine learning research, 2020 - jmlr.org
We study nonconvex optimization problems, where the objective function is either an
average of n nonconvex functions or the expectation of some stochastic function. We …

Spiderboost and momentum: Faster variance reduction algorithms

Z Wang, K Ji, Y Zhou, Y Liang… - Advances in Neural …, 2019 - proceedings.neurips.cc
SARAH and SPIDER are two recently developed stochastic variance-reduced algorithms,
and SPIDER has been shown to achieve a near-optimal first-order oracle complexity in …
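
The recursive estimator shared by SARAH and SPIDER can be sketched in one step (a toy illustration under assumed names, not the paper's algorithm): instead of re-anchoring to a snapshot as SVRG does, correct the running gradient estimate with the change in a single sample's gradient.

```python
import random

def spider_step(grad_fn, data, w, w_prev, v_prev, lr=0.05):
    """One SPIDER/SARAH-style recursive step: update the running
    gradient estimate v with the change in one sample's gradient."""
    x = random.choice(data)
    v = grad_fn(w, x) - grad_fn(w_prev, x) + v_prev
    return w - lr * v, v

# Toy problem: mean of (w - x)^2 over the data, minimizer 2.5.
data = [1.0, 2.0, 3.0, 4.0]
grad = lambda w, x: 2.0 * (w - x)
w = w_prev = 0.0
v = sum(grad(w, x) for x in data) / len(data)  # full gradient to start
for _ in range(100):
    w_next, v = spider_step(grad, data, w, w_prev, v)
    w_prev, w = w, w_next
```

In practice the estimate drifts between periodic full-gradient refreshes; momentum variants like SpiderBoost combine this recursion with larger, constant step sizes.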