Accelerated optimization in deep learning with a proportional-integral-derivative controller

S Chen, J Liu, P Wang, C Xu, S Cai, J Chu - Nature Communications, 2024 - nature.com
High-performance optimization algorithms are essential in deep learning. However,
understanding the behavior of optimization (i.e., the learning process) remains challenging due …

Practical sharpness-aware minimization cannot converge all the way to optima

D Si, C Yun - Advances in Neural Information Processing …, 2024 - proceedings.neurips.cc
Sharpness-Aware Minimization (SAM) is an optimizer that takes a descent step
based on the gradient at a perturbation $y_t = x_t + \rho\frac{\nabla f(x_t)}{\lVert\nabla f …
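The perturbation step in this snippet is the standard SAM update: ascend to a perturbed point along the normalized gradient, then take the descent step from the current iterate using the gradient evaluated at that perturbed point. A minimal sketch of one such update, assuming a simple gradient oracle `grad_fn` and illustrative values for the radius `rho` and step size `lr` (these names and defaults are assumptions, not taken from the cited paper):

```python
import numpy as np

def sam_step(x, grad_fn, rho=0.05, lr=0.1):
    """One SAM update (illustrative sketch).

    y_t = x_t + rho * grad f(x_t) / ||grad f(x_t)||   (ascent perturbation)
    x_{t+1} = x_t - lr * grad f(y_t)                  (descent from x_t)
    """
    g = grad_fn(x)
    # Normalize the gradient to move exactly `rho` away; the small
    # epsilon guards against a zero gradient.
    y = x + rho * g / (np.linalg.norm(g) + 1e-12)
    return x - lr * grad_fn(y)

# Example on f(x) = ||x||^2 / 2, whose gradient is simply x.
x = np.array([1.0, -2.0])
x_new = sam_step(x, lambda v: v)
```

On this convex quadratic the perturbed gradient still points away from the minimum, so the update shrinks the iterate toward the origin; the interesting behavior the cited papers analyze arises in the non-convex, stochastic setting.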

The crucial role of normalization in sharpness-aware minimization

Y Dai, K Ahn, S Sra - Advances in Neural Information …, 2024 - proceedings.neurips.cc
Sharpness-Aware Minimization (SAM) is a recently proposed gradient-based
optimizer (Foret et al., ICLR 2021) that greatly improves the prediction performance of deep …

SDEs for Minimax Optimization

EM Compagnoni, A Orvieto, H Kersting… - International …, 2024 - proceedings.mlr.press
Minimax optimization problems have attracted a lot of attention over the past few years, with
applications ranging from economics to machine learning. While advanced optimization …

Stabilizing Sharpness-aware Minimization Through A Simple Renormalization Strategy

C Tan, J Zhang, J Liu, Y Wang, Y Hao - arXiv preprint arXiv:2401.07250, 2024 - arxiv.org
Recently, sharpness-aware minimization (SAM) has attracted a lot of attention because of its
surprising effectiveness in improving generalization performance. However, training neural …

A Universal Class of Sharpness-Aware Minimization Algorithms

B Tahmasebi, A Soleymani, D Bahri, S Jegelka… - arXiv preprint arXiv …, 2024 - arxiv.org
Recently, there has been a surge in interest in developing optimization algorithms for
overparameterized models as achieving generalization is believed to require algorithms …

Sharpness-Aware Minimization Efficiently Selects Flatter Minima Late in Training

Z Zhou, M Wang, Y Mao, B Li, J Yan - arXiv preprint arXiv:2410.10373, 2024 - arxiv.org
Sharpness-Aware Minimization (SAM) has substantially improved the generalization of
neural networks under various settings. Despite the success, its effectiveness remains …

On statistical properties of sharpness-aware minimization: Provable guarantees

K Behdin, R Mazumder - arXiv preprint arXiv:2302.11836, 2023 - arxiv.org
Sharpness-Aware Minimization (SAM) is a recent optimization framework aiming to improve
deep neural network generalization by obtaining flatter (i.e., less sharp) solutions. As …

Adaptive Methods through the Lens of SDEs: Theoretical Insights on the Role of Noise

EM Compagnoni, T Liu, R Islamov, FN Proske… - arXiv preprint arXiv …, 2024 - arxiv.org
Despite the vast empirical evidence supporting the efficacy of adaptive optimization methods
in deep learning, their theoretical understanding is far from complete. This work introduces …

Exploring stochastic differential equation for analyzing uncertainty in wastewater treatment plant-activated sludge modeling

RS Zonouz, V Nourani, M Sayyah-Fard… - AQUA—Water …, 2024 - iwaponline.com
The management of wastewater treatment plant (WWTP) and the assessment of uncertainty
in its design are crucial from an environmental engineering perspective. One of the key …