Activated gradients for deep neural networks
Deep neural networks often suffer from poor performance or even training failure due to
ill-conditioning, the vanishing/exploding gradient problem, and the saddle point …
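A minimal sketch of the general idea, assuming PyTorch and using tanh as a stand-in gradient activation (the paper's actual activation function may differ): a backward hook passes each parameter's gradient through a bounded nonlinearity before the optimizer sees it.

import torch

def activate_gradient(grad):
    # Hypothetical gradient activation: squash each entry through a
    # bounded function so huge entries shrink smoothly, sign preserved.
    return torch.tanh(grad)

model = torch.nn.Linear(10, 1)
for p in model.parameters():
    p.register_hook(activate_gradient)  # runs during backward()

x, y = torch.randn(32, 10), torch.randn(32, 1)
loss = torch.nn.functional.mse_loss(model(x), y)
loss.backward()  # p.grad now holds the activated gradients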
Long short-term memory with activation on gradient
As the number of long short-term memory (LSTM) layers increases, vanishing/exploding
gradient problems worsen and degrade the performance of the LSTM …
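The same hook mechanism extends to stacked LSTMs, where gradients traverse many layers and time steps; an illustrative sketch (tanh is again a hypothetical stand-in for the paper's gradient activation):

import torch

lstm = torch.nn.LSTM(input_size=16, hidden_size=32, num_layers=4)
for p in lstm.parameters():
    p.register_hook(torch.tanh)  # bound each layer's gradients

seq = torch.randn(50, 8, 16)     # (time, batch, features)
out, _ = lstm(seq)
out.mean().backward()            # long backprop path, gradients squashed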
Optimal first-order methods for convex functions with a quadratic upper bound
We analyze worst-case convergence guarantees of first-order optimization methods over a
function class extending that of smooth and convex functions. This class contains convex …
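For reference, and as an assumption about which inequality the snippet means: an L-smooth convex function satisfies the quadratic upper bound

\[
f(y) \;\le\; f(x) + \langle \nabla f(x),\, y - x \rangle + \frac{L}{2}\,\|y - x\|^2
\qquad \text{for all } x, y,
\]

and the extended class retains only an upper bound of this form, without requiring full smoothness.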
Gradient descent is optimal under lower restricted secant inequality and upper error bound
The study of first-order optimization is sensitive to the assumptions made on the objective
functions. These assumptions induce complexity classes which play a key role in worst-case …
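The two conditions in the title are commonly stated as follows (standard formulations; the paper's exact notation may differ), with $x_p$ the projection of $x$ onto the set of minimizers:

\[
\langle \nabla f(x),\, x - x_p \rangle \;\ge\; \mu\,\|x - x_p\|^2
\quad \text{(lower restricted secant inequality)},
\qquad
\|\nabla f(x)\| \;\le\; L\,\|x - x_p\|
\quad \text{(upper error bound)}.
\]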
An exponentially converging particle method for the mixed Nash equilibrium of continuous games
We consider the problem of computing mixed Nash equilibria of two-player zero-sum games
with continuous sets of pure strategies and with first-order access to the payoff function. This …
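As a toy illustration of the setting, not the paper's exponentially converging method: each player's mixed strategy is a particle cloud, updated by descent/ascent on the expected payoff under the opponent's empirical distribution. The payoff f(x, y) = sin(x - y) is an arbitrary stand-in.

import numpy as np

def grad_x(x, y):    # df/dx for f(x, y) = sin(x - y)
    return np.cos(x - y)

def grad_y(x, y):    # df/dy for the same payoff
    return -np.cos(x - y)

rng = np.random.default_rng(0)
X = rng.normal(size=64)   # min player's particles (pure strategies)
Y = rng.normal(size=64)   # max player's particles
eta = 0.05
for _ in range(1000):
    gx = grad_x(X[:, None], Y[None, :]).mean(axis=1)  # E_Y[df/dx]
    gy = grad_y(X[:, None], Y[None, :]).mean(axis=0)  # E_X[df/dy]
    X, Y = X - eta * gx, Y + eta * gy  # descent for min, ascent for max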
Communication-efficient federated learning: A second-order Newton-type method with analog over-the-air aggregation
Owing to their fast convergence, second-order Newton-type learning methods have recently
received attention in the federated learning (FL) setting. However, current solutions are …
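A hedged sketch of the generic pattern such methods follow (the local losses, damping, and noise model are illustrative stand-ins, not the paper's protocol): each device computes a local Newton-type direction, and analog over-the-air aggregation is modeled by the channel summing all transmissions plus noise.

import numpy as np

rng = np.random.default_rng(0)
d, K = 5, 10                        # dimension, number of devices
A = [rng.normal(size=(20, d)) for _ in range(K)]
b = [rng.normal(size=20) for _ in range(K)]
w = np.zeros(d)

for _ in range(20):
    directions = []
    for Ak, bk in zip(A, b):        # local least-squares loss per device
        g = Ak.T @ (Ak @ w - bk) / len(bk)
        H = Ak.T @ Ak / len(bk) + 1e-3 * np.eye(d)
        directions.append(np.linalg.solve(H, g))  # local Newton direction
    # Analog over-the-air: the channel sums the signals; noise is inherent.
    aggregated = sum(directions) + 0.01 * rng.normal(size=d)
    w -= aggregated / K             # averaged Newton-type step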
On the Convergence of AdaGrad (Norm) on R^d: Beyond Convexity, Non-Asymptotic Rate and Acceleration
Existing analysis of AdaGrad and other adaptive methods for smooth convex optimization is
typically for functions with bounded domain diameter. In unconstrained problems, previous …
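For concreteness, AdaGrad-Norm uses one scalar step size driven by the accumulated squared gradient norms, so no bound on the domain diameter enters the update; a minimal sketch on a toy quadratic (constants arbitrary):

import numpy as np

def grad(x):                 # gradient of f(x) = 0.5 * ||x||^2
    return x

x = np.full(4, 10.0)
eta, b2 = 1.0, 1e-8          # b2 accumulates squared gradient norms
for _ in range(500):
    g = grad(x)
    b2 += np.dot(g, g)
    x -= eta / np.sqrt(b2) * g   # single adaptive scalar step size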
DIN: A decentralized inexact Newton algorithm for consensus optimization
This paper tackles a challenging decentralized consensus optimization problem defined
over a network of interconnected devices. The devices work collaboratively to solve a …
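A generic sketch of the decentralized Newton-type template (illustrative only; DIN's specific inexact steps and communication scheme are not reproduced): nodes average iterates with neighbors through a mixing matrix W and damp local Newton steps on their own objectives.

import numpy as np

rng = np.random.default_rng(1)
n, d = 4, 3
H = [np.diag(rng.uniform(1, 3, size=d)) for _ in range(n)]  # local Hessians
c = [rng.normal(size=d) for _ in range(n)]                  # local linear terms
W = np.full((n, n), 1.0 / n)   # doubly stochastic mixing (complete graph)
X = np.zeros((n, d))           # one iterate per node

for _ in range(50):
    X = W @ X                                       # consensus averaging
    for i in range(n):
        g = H[i] @ X[i] + c[i]                      # local gradient
        X[i] -= 0.5 * np.linalg.solve(H[i], g)      # damped Newton step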
Constrained minimum variance and covariance steering based on affine disturbance feedback control parameterization
This paper deals with finite-horizon minimum-variance and covariance steering problems
subject to constraints. The goal of the minimum variance problem is to steer the state mean …
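A common form of the affine disturbance feedback parameterization (stated as background; the paper's precise formulation may differ) makes the control affine in past disturbances rather than in the state:

\[
u_k \;=\; \bar{u}_k + \sum_{i=0}^{k-1} K_{k,i}\, w_i ,
\]

where $\bar{u}_k$ is a feedforward term and the gains $K_{k,i}$ act on the realized disturbances $w_i$; a standard motivation for this choice is that the resulting constraints are convex in the decision variables $(\bar{u}, K)$.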
Mean-Field Langevin Dynamics for Signed Measures via a Bilevel Approach
Mean-field Langevin dynamics (MFLD) is a class of interacting particle methods that tackle
convex optimization over probability measures on a manifold, which are scalable, versatile …
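The basic discretization behind MFLD is noisy particle gradient descent; a minimal sketch with an arbitrary quadratic potential (the paper's bilevel treatment of signed measures is not reproduced here):

import numpy as np

def grad_V(x):               # gradient of the potential V(x) = x^2 / 2
    return x

rng = np.random.default_rng(2)
X = rng.normal(size=256)     # particles approximating the measure
eta, lam = 0.01, 0.1         # step size, entropic regularization strength
for _ in range(2000):
    noise = rng.normal(size=X.shape)
    X = X - eta * grad_V(X) + np.sqrt(2 * eta * lam) * noise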