Training structured neural networks through manifold identification and variance reduction

ZS Huang, C Lee - arXiv preprint arXiv:2112.02612, 2021 - arxiv.org
This paper proposes an algorithm (RMDA) for training neural networks (NNs) with a
regularization term for promoting desired structures. RMDA does not incur computation …

Newton acceleration on manifolds identified by proximal gradient methods

G Bareilles, F Iutzeler, J Malick - Mathematical Programming, 2023 - Springer
Proximal methods are known to identify the underlying substructure of nonsmooth
optimization problems. Even more, in many interesting situations, the output of a proximity …

Precoder Design for User-Centric Network Massive MIMO with Matrix Manifold Optimization

R Sun, L You, AA Lu, C Sun, X Gao, XG Xia - arXiv preprint arXiv …, 2024 - arxiv.org
In this paper, we investigate the precoder design for user-centric network (UCN) massive
multiple-input multiple-output (mMIMO) downlink with matrix manifold optimization. In UCN …

A proximal-gradient method for problems with overlapping group-sparse regularization: support identification complexity

Y Dai, DP Robinson - Optimization Methods and Software, 2024 - Taylor & Francis
We consider the proximal-gradient method for minimizing the sum of a smooth function and
a convex non-smooth overlapping group-$\ell_1$ regularizer, which is known to promote sparse …

Sampling-based methods for multi-block optimization problems over transport polytopes

Y Hu, M Li, X Liu, C Meng - Mathematics of Computation, 2024 - ams.org
This paper focuses on multi-block optimization problems over transport polytopes, which
underlie various applications including strongly correlated quantum physics and machine …

A Stochastic Block-coordinate Proximal Newton Method for Nonconvex Composite Minimization

H Zhu, X Qian - arXiv preprint arXiv:2412.18394, 2024 - arxiv.org
We propose a stochastic block-coordinate proximal Newton method for minimizing the sum
of a blockwise Lipschitz-continuously differentiable function and a separable nonsmooth …

Accelerated projected gradient algorithms for sparsity constrained optimization problems

JH Alcantara, C Lee - Advances in Neural Information …, 2022 - proceedings.neurips.cc
We consider the projected gradient algorithm for the nonconvex best subset selection
problem that minimizes a given empirical loss function under an $\ell_0$-norm constraint …

Sampling-Based Approaches for Multimarginal Optimal Transport Problems with Coulomb Cost

Y Hu, M Li, X Liu, C Meng - arXiv preprint arXiv:2306.16763, 2023 - arxiv.org
The multimarginal optimal transport problem with Coulomb cost arises in quantum physics
and is vital in understanding strongly correlated quantum systems. Its intrinsic curse of …

Inexact proximal-gradient methods with support identification

Y Dai, DP Robinson - arXiv preprint arXiv:2211.02214, 2022 - arxiv.org
We consider the proximal-gradient method for minimizing an objective function that is the
sum of a smooth function and a non-smooth convex function. A feature that distinguishes our …

Regularized Adaptive Momentum Dual Averaging with an Efficient Inexact Subproblem Solver for Training Structured Neural Network

ZS Huang, C Lee - arXiv preprint arXiv:2403.14398, 2024 - arxiv.org
We propose a Regularized Adaptive Momentum Dual Averaging (RAMDA) algorithm for
training structured neural networks. Similar to existing regularized adaptive methods, the …