What is the long-run distribution of stochastic gradient descent? A large deviations analysis

W Azizian, F Iutzeler, J Malick… - arXiv preprint arXiv …, 2024 - arxiv.org
In this paper, we examine the long-run distribution of stochastic gradient descent (SGD) in
general, non-convex problems. Specifically, we seek to understand which regions of the …