Research progress on stochastic gradient descent algorithms

史加荣, 王丹, 尚凡华, 张鹤于 - 自动化学报, 2021 - aas.net.cn
In machine learning, gradient descent is the most important and fundamental method for solving optimization problems. As data scales continue to grow, traditional gradient descent algorithms can no longer effectively solve large-scale machine learning problems. Stochastic gradient descent, in its iterations …
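The snippet above contrasts full-batch gradient descent with its stochastic variant. As a minimal illustration (not from the paper), here is plain SGD on a least-squares objective, updating with one randomly chosen sample per step; the function name and problem setup are my own:

```python
import numpy as np

def sgd_least_squares(A, b, lr=0.01, n_epochs=50, seed=0):
    """Plain SGD on (1/2n) * ||Ax - b||^2, one random row per update."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    for _ in range(n_epochs):
        for i in rng.permutation(n):          # shuffle samples each epoch
            g = (A[i] @ x - b[i]) * A[i]      # gradient of the i-th term
            x -= lr * g
    return x
```

Each update costs O(d) instead of O(nd), which is the point of the large-scale argument in the abstract: the per-iteration cost no longer grows with the dataset size.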

Are we there yet? Manifold identification of gradient-related proximal methods

Y Sun, H Jeong, J Nutini… - The 22nd International …, 2019 - proceedings.mlr.press
In machine learning, models that generalize better often generate outputs that lie on a low-
dimensional manifold. Recently, several works have separately shown finite-time manifold …

Learning step sizes for unfolded sparse coding

P Ablin, T Moreau, M Massias… - Advances in Neural …, 2019 - proceedings.neurips.cc
Sparse coding is typically solved by iterative optimization techniques, such as the Iterative
Shrinkage-Thresholding Algorithm (ISTA). Unfolding and learning weights of ISTA using …
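ISTA, named in the snippet above, alternates a gradient step on the smooth data-fit term with soft-thresholding. A minimal sketch for the Lasso problem min_x (1/2)||Ax − b||² + λ||x||₁, assuming a fixed step size 1/L with L the squared spectral norm of A (the standard Lipschitz constant); the function name is mine:

```python
import numpy as np

def ista(A, b, lam, n_iter=200):
    """ISTA for min_x 0.5*||Ax - b||^2 + lam*||x||_1."""
    L = np.linalg.norm(A, ord=2) ** 2        # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        z = x - (A.T @ (A @ x - b)) / L      # gradient step on the smooth part
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft-threshold
    return x
```

The "unfolding" idea in the snippet treats a fixed number of such iterations as layers of a network and learns the step sizes and thresholds instead of fixing them analytically.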

An accelerated doubly stochastic gradient method with faster explicit model identification

R Bao, B Gu, H Huang - Proceedings of the 31st ACM International …, 2022 - dl.acm.org
Sparsity regularized loss minimization problems play an important role in various fields
including machine learning, data mining, and modern statistics. Proximal gradient descent …

Hybrid ISTA: Unfolding ISTA with convergence guarantees using free-form deep neural networks

Z Zheng, W Dai, D Xue, C Li, J Zou… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
It is promising to solve linear inverse problems by unfolding iterative algorithms (e.g., iterative
shrinkage thresholding algorithm (ISTA)) as deep neural networks (DNNs) with learnable …

Nonsmoothness in machine learning: specific structure, proximal identification, and applications

F Iutzeler, J Malick - Set-Valued and Variational Analysis, 2020 - Springer
Nonsmoothness is often a curse for optimization; but it is sometimes a blessing, in particular
for applications in machine learning. In this paper, we present the specific structure of …

Dual extrapolation for sparse GLMs

M Massias, S Vaiter, A Gramfort, J Salmon - Journal of Machine Learning …, 2020 - jmlr.org
Generalized Linear Models (GLM) form a wide class of regression and classification models,
where prediction is a function of a linear combination of the input variables. For statistical …

Accelerating inexact successive quadratic approximation for regularized optimization through manifold identification

C Lee - Mathematical Programming, 2023 - Springer
For regularized optimization that minimizes the sum of a smooth term and a regularizer that
promotes structured solutions, inexact proximal-Newton-type methods, or successive …

A stochastic extra-step quasi-Newton method for nonsmooth nonconvex optimization

M Yang, A Milzarek, Z Wen, T Zhang - Mathematical Programming, 2022 - Springer
In this paper, a novel stochastic extra-step quasi-Newton method is developed to solve a
class of nonsmooth nonconvex composite optimization problems. We assume that the …

SAGA with arbitrary sampling

X Qian, Z Qu, P Richtárik - International Conference on …, 2019 - proceedings.mlr.press
We study the problem of minimizing the average of a very large number of smooth functions,
which is of key importance in training supervised learning models. One of the most …
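SAGA, the method named in the snippet above, reduces the variance of plain SGD by storing the most recent gradient of each component function and correcting each stochastic step with that table. A minimal sketch for a finite-sum least-squares objective with uniform sampling (the paper's contribution is precisely to go beyond uniform sampling); names and the problem setup are my own:

```python
import numpy as np

def saga_least_squares(A, b, lr=0.01, n_iter=2000, seed=0):
    """SAGA on f(x) = (1/n) * sum_i 0.5*(a_i^T x - b_i)^2, uniform sampling."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    grads = np.zeros((n, d))              # table of stored per-sample gradients
    g_avg = grads.mean(axis=0)            # running average of the table
    for _ in range(n_iter):
        i = rng.integers(n)
        g_new = (A[i] @ x - b[i]) * A[i]  # fresh gradient of component i
        x -= lr * (g_new - grads[i] + g_avg)   # variance-reduced step
        g_avg += (g_new - grads[i]) / n        # keep the average consistent
        grads[i] = g_new
    return x
```

Because the correction term has zero mean, the step is still an unbiased gradient estimate, but its variance vanishes as the iterates converge, which is what allows a constant step size.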