Why does sharpness-aware minimization generalize better than SGD?
The challenge of overfitting, in which the model memorizes the training data and fails to
generalize to test data, has become increasingly significant in the training of large neural …
Friendly sharpness-aware minimization
Abstract Sharpness-Aware Minimization (SAM) has been instrumental in improving deep
neural network training by minimizing both training loss and loss sharpness. Despite the …
Sharpness-aware minimization: An implicit regularization perspective
K Behdin, R Mazumder - stat, 2023 - researchgate.net
Abstract Sharpness-Aware Minimization (SAM) is a recent optimization framework aiming to
improve the deep neural network generalization, through obtaining flatter (i.e. less sharp) …
Decentralized stochastic sharpness-aware minimization algorithm
In recent years, distributed stochastic algorithms have become increasingly useful in the
field of machine learning. However, similar to traditional stochastic algorithms, they face a …
A Universal Class of Sharpness-Aware Minimization Algorithms
Recently, there has been a surge in interest in developing optimization algorithms for
overparameterized models as achieving generalization is believed to require algorithms …
On statistical properties of sharpness-aware minimization: Provable guarantees
K Behdin, R Mazumder - arXiv preprint arXiv:2302.11836, 2023 - arxiv.org
Sharpness-Aware Minimization (SAM) is a recent optimization framework aiming to improve
the deep neural network generalization, through obtaining flatter (i.e. less sharp) solutions. As …