Research progress on stochastic gradient descent algorithms
史加荣, 王丹, 尚凡华, 张鹤于 - Acta Automatica Sinica (自动化学报), 2021 - aas.net.cn
In machine learning, gradient descent is the most important and fundamental method for solving
optimization problems. As data scales keep growing, the traditional gradient descent algorithm can
no longer effectively solve large-scale machine learning problems. Stochastic gradient descent, at each iteration …
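The survey above concerns stochastic gradient descent (SGD). A minimal NumPy sketch of vanilla SGD on a least-squares objective, sampling one data point per step (an illustrative toy, not any specific variant from the survey; the function name and constants are my own):

```python
import numpy as np

def sgd_least_squares(A, b, lr=0.01, epochs=50, seed=0):
    """Minimize 0.5 * ||A x - b||^2 by stepping along one row's gradient at a time."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    for _ in range(epochs):
        for i in rng.permutation(n):            # one pass over shuffled samples
            grad_i = (A[i] @ x - b[i]) * A[i]   # gradient of the i-th summand
            x -= lr * grad_i
    return x
```

Each step costs O(d) regardless of n, which is the point of SGD for large-scale problems; the price is gradient noise, which constant-step SGD only damps down to a neighborhood of the solution in general.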
Are we there yet? manifold identification of gradient-related proximal methods
In machine learning, models that generalize better often generate outputs that lie on a low-
dimensional manifold. Recently, several works have separately shown finite-time manifold …
Learning step sizes for unfolded sparse coding
Sparse coding is typically solved by iterative optimization techniques, such as the Iterative
Shrinkage-Thresholding Algorithm (ISTA). Unfolding and learning weights of ISTA using …
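The entry above unfolds ISTA into a learned network; for reference, a minimal NumPy sketch of plain (non-learned) ISTA for the lasso, min 0.5·||Ax−b||² + λ·||x||₁, with the standard 1/L step size (function names are my own):

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1: shrink toward zero by t, clip at zero."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(A, b, lam, n_iter=200):
    """ISTA: gradient step on the smooth part, then soft-thresholding."""
    L = np.linalg.norm(A, 2) ** 2      # Lipschitz constant of the gradient: ||A||_2^2
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        x = soft_threshold(x - (A.T @ (A @ x - b)) / L, lam / L)
    return x
```

Unfolded variants replace the fixed 1/L step and threshold with per-layer learned parameters; the soft-thresholding nonlinearity is what each layer keeps.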
An accelerated doubly stochastic gradient method with faster explicit model identification
Sparsity regularized loss minimization problems play an important role in various fields
including machine learning, data mining, and modern statistics. Proximal gradient descent …
Hybrid ISTA: Unfolding ISTA with convergence guarantees using free-form deep neural networks
It is promising to solve linear inverse problems by unfolding iterative algorithms (e.g., the iterative
shrinkage-thresholding algorithm (ISTA)) as deep neural networks (DNNs) with learnable …
Nonsmoothness in machine learning: specific structure, proximal identification, and applications
F Iutzeler, J Malick - Set-Valued and Variational Analysis, 2020 - Springer
Nonsmoothness is often a curse for optimization; but it is sometimes a blessing, in particular
for applications in machine learning. In this paper, we present the specific structure of …
Dual extrapolation for sparse GLMs
Generalized Linear Models (GLM) form a wide class of regression and classification models,
where prediction is a function of a linear combination of the input variables. For statistical …
Accelerating inexact successive quadratic approximation for regularized optimization through manifold identification
C Lee - Mathematical Programming, 2023 - Springer
For regularized optimization that minimizes the sum of a smooth term and a regularizer that
promotes structured solutions, inexact proximal-Newton-type methods, or successive …
A stochastic extra-step quasi-Newton method for nonsmooth nonconvex optimization
In this paper, a novel stochastic extra-step quasi-Newton method is developed to solve a
class of nonsmooth nonconvex composite optimization problems. We assume that the …
SAGA with arbitrary sampling
We study the problem of minimizing the average of a very large number of smooth functions,
which is of key importance in training supervised learning models. One of the most …
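SAGA is a variance-reduced stochastic method: it keeps a table of the last gradient seen for each component and corrects each stochastic step with it. A minimal sketch under the uniform-sampling setting (names and constants are my own; the paper's contribution is extending SAGA beyond uniform sampling):

```python
import numpy as np

def saga(grads, x0, lr, n_steps, seed=0):
    """SAGA on f(x) = (1/n) * sum_i f_i(x); grads[i](x) returns grad f_i(x)."""
    rng = np.random.default_rng(seed)
    n = len(grads)
    x = x0.copy()
    table = [g(x0) for g in grads]              # stored past gradient per component
    avg = np.mean(table, axis=0)                # running average of the table
    for _ in range(n_steps):
        i = rng.integers(n)                     # uniform sampling
        g_new = grads[i](x)
        x -= lr * (g_new - table[i] + avg)      # variance-reduced gradient estimate
        avg += (g_new - table[i]) / n           # keep the average consistent
        table[i] = g_new
    return x
```

The correction term `g_new - table[i] + avg` is unbiased and its variance vanishes as the iterates converge, which is what lets SAGA use a constant step size and still converge linearly on strongly convex problems.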