Research progress on stochastic gradient descent algorithms

史加荣, 王丹, 尚凡华, 张鹤于 - 自动化学报, 2021 - aas.net.cn
In machine learning, gradient descent is the most important and fundamental method for solving optimization problems. As data scales continue to grow, traditional gradient descent algorithms can no longer effectively solve large-scale machine learning problems. Stochastic gradient descent, in its iterations …
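The snippet above contrasts full-batch gradient descent with its stochastic variant. As a minimal illustration (not from the paper), here is plain SGD on a least-squares objective, updating with one randomly chosen sample per step; the function name and problem setup are my own:

```python
import numpy as np

def sgd_least_squares(A, b, lr=0.01, n_epochs=50, seed=0):
    """Plain SGD on (1/2n) * ||Ax - b||^2, one random row per update."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    for _ in range(n_epochs):
        for i in rng.permutation(n):          # shuffle samples each epoch
            g = (A[i] @ x - b[i]) * A[i]      # gradient of the i-th term
            x -= lr * g
    return x
```

Each update costs O(d) instead of O(nd), which is the point of the large-scale argument in the abstract: the per-iteration cost no longer grows with the dataset size.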

Are we there yet? Manifold identification of gradient-related proximal methods

Y Sun, H Jeong, J Nutini… - The 22nd International …, 2019 - proceedings.mlr.press
In machine learning, models that generalize better often generate outputs that lie on a low-
dimensional manifold. Recently, several works have separately shown finite-time manifold …

Learning step sizes for unfolded sparse coding

P Ablin, T Moreau, M Massias… - Advances in Neural …, 2019 - proceedings.neurips.cc
Sparse coding is typically solved by iterative optimization techniques, such as the Iterative
Shrinkage-Thresholding Algorithm (ISTA). Unfolding and learning weights of ISTA using …
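ISTA, named in the snippet above, alternates a gradient step on the smooth data-fit term with soft-thresholding. A minimal sketch for the Lasso problem min_x (1/2)||Ax − b||² + λ||x||₁, assuming a fixed step size 1/L with L the squared spectral norm of A (the standard Lipschitz constant); the function name is mine:

```python
import numpy as np

def ista(A, b, lam, n_iter=200):
    """ISTA for min_x 0.5*||Ax - b||^2 + lam*||x||_1."""
    L = np.linalg.norm(A, ord=2) ** 2        # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        z = x - (A.T @ (A @ x - b)) / L      # gradient step on the smooth part
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft-threshold
    return x
```

The "unfolding" idea in the snippet treats a fixed number of such iterations as layers of a network and learns the step sizes and thresholds instead of fixing them analytically.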

An accelerated doubly stochastic gradient method with faster explicit model identification

R Bao, B Gu, H Huang - Proceedings of the 31st ACM International …, 2022 - dl.acm.org
Sparsity regularized loss minimization problems play an important role in various fields
including machine learning, data mining, and modern statistics. Proximal gradient descent …

Hybrid ISTA: Unfolding ISTA with convergence guarantees using free-form deep neural networks

Z Zheng, W Dai, D Xue, C Li, J Zou… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
It is promising to solve linear inverse problems by unfolding iterative algorithms (e.g., iterative
shrinkage thresholding algorithm (ISTA)) as deep neural networks (DNNs) with learnable …

Nonsmoothness in machine learning: specific structure, proximal identification, and applications

F Iutzeler, J Malick - Set-Valued and Variational Analysis, 2020 - Springer
Nonsmoothness is often a curse for optimization; but it is sometimes a blessing, in particular
for applications in machine learning. In this paper, we present the specific structure of …

Dual extrapolation for sparse GLMs

M Massias, S Vaiter, A Gramfort, J Salmon - Journal of Machine Learning …, 2020 - jmlr.org
Generalized Linear Models (GLM) form a wide class of regression and classification models,
where prediction is a function of a linear combination of the input variables. For statistical …

Accelerating inexact successive quadratic approximation for regularized optimization through manifold identification

C Lee - Mathematical Programming, 2023 - Springer
For regularized optimization that minimizes the sum of a smooth term and a regularizer that
promotes structured solutions, inexact proximal-Newton-type methods, or successive …

A stochastic extra-step quasi-Newton method for nonsmooth nonconvex optimization

M Yang, A Milzarek, Z Wen, T Zhang - Mathematical Programming, 2022 - Springer
In this paper, a novel stochastic extra-step quasi-Newton method is developed to solve a
class of nonsmooth nonconvex composite optimization problems. We assume that the …

SAGA with arbitrary sampling

X Qian, Z Qu, P Richtárik - International Conference on …, 2019 - proceedings.mlr.press
We study the problem of minimizing the average of a very large number of smooth functions,
which is of key importance in training supervised learning models. One of the most …
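SAGA, the method named in the snippet above, reduces the variance of plain SGD by storing the most recent gradient of each component function and correcting each stochastic step with that table. A minimal sketch for a finite-sum least-squares objective with uniform sampling (the paper's contribution is precisely to go beyond uniform sampling); names and the problem setup are my own:

```python
import numpy as np

def saga_least_squares(A, b, lr=0.01, n_iter=2000, seed=0):
    """SAGA on f(x) = (1/n) * sum_i 0.5*(a_i^T x - b_i)^2, uniform sampling."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    grads = np.zeros((n, d))              # table of stored per-sample gradients
    g_avg = grads.mean(axis=0)            # running average of the table
    for _ in range(n_iter):
        i = rng.integers(n)
        g_new = (A[i] @ x - b[i]) * A[i]  # fresh gradient of component i
        x -= lr * (g_new - grads[i] + g_avg)   # variance-reduced step
        g_avg += (g_new - grads[i]) / n        # keep the average consistent
        grads[i] = g_new
    return x
```

Because the correction term has zero mean, the step is still an unbiased gradient estimate, but its variance vanishes as the iterates converge, which is what allows a constant step size.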