Stochastic nested variance reduction for nonconvex optimization
We study nonconvex optimization problems, where the objective function is either an
average of n nonconvex functions or the expectation of some stochastic function. We …
Convex optimization algorithms in medical image reconstruction—in the age of AI
The past decade has seen the rapid growth of model-based image reconstruction (MBIR)
algorithms, which are often applications or adaptations of convex optimization algorithms …
Momentum improves normalized SGD
A Cutkosky, H Mehta - International conference on machine …, 2020 - proceedings.mlr.press
We provide an improved analysis of normalized SGD showing that adding momentum
provably removes the need for large batch sizes on non-convex objectives. Then, we …
A single-timescale method for stochastic bilevel optimization
Stochastic bilevel optimization generalizes the classic stochastic optimization from the
minimization of a single objective to the minimization of an objective function that depends …
A general sample complexity analysis of vanilla policy gradient
We adapt recent tools developed for the analysis of Stochastic Gradient Descent (SGD) in
non-convex optimization to obtain convergence and sample complexity guarantees for the …
A unified convergence analysis for shuffling-type gradient methods
In this paper, we propose a unified convergence analysis for a class of generic shuffling-type
gradient methods for solving finite-sum optimization problems. Our analysis works with any …
Accelerated zeroth-order and first-order momentum methods from mini to minimax optimization
In the paper, we propose a class of accelerated zeroth-order and first-order momentum
methods for both nonconvex mini-optimization and minimax-optimization. Specifically, we …
On momentum-based gradient methods for bilevel optimization with nonconvex lower-level
F Huang - arXiv preprint arXiv:2303.03944, 2023 - arxiv.org
Bilevel optimization is a popular two-level hierarchical optimization, which has been widely
applied to many machine learning tasks such as hyperparameter learning, meta learning …
Momentum-based policy gradient methods
In the paper, we propose a class of efficient momentum-based policy gradient methods for
model-free reinforcement learning, which use adaptive learning rates and do not require …