Painless stochastic gradient: Interpolation, line-search, and convergence rates

S Vaswani, A Mishkin, I Laradji… - Advances in neural …, 2019 - proceedings.neurips.cc
Recent works have shown that stochastic gradient descent (SGD) achieves the fast
convergence rates of full-batch gradient descent for over-parameterized models satisfying …
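A minimal sketch of the kind of stochastic Armijo backtracking line search this line of work studies, run on a toy over-parameterized least-squares problem where interpolation holds. The problem sizes and the constants c and beta are illustrative assumptions; this is not the authors' released implementation.

```python
# Sketch: SGD where each mini-batch step size is found by Armijo backtracking,
# on an interpolating (d > n, noiseless) least-squares problem.
import numpy as np

rng = np.random.default_rng(0)
n, d = 50, 200                       # d > n, so a zero-loss interpolating solution exists
A = rng.standard_normal((n, d))
x_true = rng.standard_normal(d)
b = A @ x_true                       # noiseless targets -> interpolation holds

def batch_loss_grad(x, idx):
    r = A[idx] @ x - b[idx]
    return 0.5 * np.dot(r, r) / len(idx), A[idx].T @ r / len(idx)

x = np.zeros(d)
eta_max, c, beta = 10.0, 0.1, 0.7    # initial step, Armijo constant, shrink factor
for it in range(500):
    idx = rng.choice(n, size=8, replace=False)     # mini-batch
    loss, g = batch_loss_grad(x, idx)
    eta = eta_max
    # Backtrack until the mini-batch Armijo condition holds:
    #   f_i(x - eta g) <= f_i(x) - c * eta * ||g||^2
    while batch_loss_grad(x - eta * g, idx)[0] > loss - c * eta * (g @ g):
        eta *= beta
    x -= eta * g

print("final full-batch loss:", 0.5 * np.linalg.norm(A @ x - b) ** 2 / n)
```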

Backtracking gradient descent method and some applications in large scale optimisation. Part 2: Algorithms and experiments

TT Truong, HT Nguyen - Applied Mathematics & Optimization, 2021 - Springer
In this paper, we provide new results and algorithms (including backtracking versions of
Nesterov accelerated gradient and Momentum) which are more applicable to large scale …
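As a concrete, heavily hedged illustration of a "backtracking version of Nesterov accelerated gradient": the gradient step at the look-ahead point is shrunk by a constant factor until an Armijo-type decrease test passes. The momentum schedule and constants below are generic choices, not necessarily the scheme analyzed in the paper.

```python
# Sketch: Nesterov accelerated gradient with Armijo backtracking at the
# look-ahead point y_k. Momentum schedule t_k and constants are illustrative.
import numpy as np

def nag_backtracking(f, grad, x0, iters=200, eta0=1.0, c=1e-4, beta=0.5):
    x_prev = x = np.asarray(x0, dtype=float)
    t_prev = 1.0
    for _ in range(iters):
        t = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t_prev ** 2))
        y = x + ((t_prev - 1.0) / t) * (x - x_prev)    # look-ahead (extrapolated) point
        g = grad(y)
        eta, fy = eta0, f(y)
        while f(y - eta * g) > fy - c * eta * (g @ g): # Armijo test at y
            eta *= beta
        x_prev, x, t_prev = x, y - eta * g, t
    return x

# toy ill-conditioned quadratic: f(x) = 0.5 * x^T Q x
Q = np.diag([1.0, 10.0, 100.0])
x_min = nag_backtracking(lambda x: 0.5 * x @ Q @ x, lambda x: Q @ x, [1.0, 1.0, 1.0])
print(x_min)
```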

Why line search when you can plane search? SO-friendly neural networks allow per-iteration optimization of learning and momentum rates for every layer

B Shea, M Schmidt - arXiv preprint arXiv:2406.17954, 2024 - arxiv.org
We introduce the class of SO-friendly neural networks, which include several models used in
practice, such as networks with 2 layers of hidden weights where the number of inputs is …
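To make the idea of a per-iteration "plane search" concrete: for a plain linear least-squares model, the loss restricted to the plane spanned by the negative gradient and the previous update is itself a tiny least-squares problem, so the learning rate and momentum rate can be chosen exactly at every iteration. The sketch below rests on that toy assumption and is not the paper's treatment of neural networks.

```python
# Sketch: per-iteration "plane search" on linear least squares,
# minimizing f(w) = 0.5 * ||X w - y||^2 over the plane w + a*d1 + b*d2,
# where d1 = -gradient and d2 = previous update (momentum direction).
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((100, 20))
y = rng.standard_normal(100)

w = np.zeros(20)
prev_step = np.zeros(20)
for it in range(50):
    r = X @ w - y
    d1 = -(X.T @ r)                  # negative gradient direction
    d2 = prev_step                   # momentum direction
    # The restricted problem min_{a,b} ||r + a*(X d1) + b*(X d2)||^2
    # is a 2-column least-squares problem, solved exactly here.
    B = np.column_stack([X @ d1, X @ d2])
    coef, *_ = np.linalg.lstsq(B, -r, rcond=None)
    a, b = coef
    step = a * d1 + b * d2
    w, prev_step = w + step, step

print("loss:", 0.5 * np.linalg.norm(X @ w - y) ** 2)
```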

Convergence to minima for the continuous version of backtracking gradient descent

TT Truong - arXiv preprint arXiv:1911.04221, 2019 - arxiv.org
The main result of this paper is: {\bf Theorem.} Let $f:\mathbb{R}^k\rightarrow\mathbb{R}$
be a $C^{1}$ function, so that $\nabla f$ is locally Lipschitz continuous. Assume moreover …

Some convergent results for Backtracking Gradient Descent method on Banach spaces

TT Truong - arXiv preprint arXiv:2001.05768, 2020 - arxiv.org
Our main result concerns the following condition: {\bf Condition C.} Let $X$ be a Banach
space. A $C^1$ function $f: X\rightarrow\mathbb{R}$ satisfies Condition C if whenever …

Fast Forwarding Low-Rank Training

A Rahamim, N Saphra, S Kangaslahti… - arXiv preprint arXiv …, 2024 - arxiv.org
Parameter-efficient finetuning methods like low-rank adaptation (LoRA) aim to reduce the
computational costs of finetuning pretrained Language Models (LMs). Enabled by these low …
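For readers who have not seen LoRA itself: the adapted layer adds a trainable low-rank product to a frozen pretrained weight matrix. The sketch below shows that generic structure with commonly used conventions (rank, alpha/r scaling, zero-initialized B); it is not the fast-forwarding method of this paper.

```python
# Generic LoRA-style linear layer: frozen weight W0 plus a trainable
# low-rank update (alpha / r) * B @ A.
import numpy as np

class LoRALinear:
    def __init__(self, W0, r=8, alpha=16, rng=None):
        rng = rng or np.random.default_rng(0)
        d_out, d_in = W0.shape
        self.W0 = W0                                     # frozen pretrained weight
        self.A = rng.standard_normal((r, d_in)) * 0.01   # trainable
        self.B = np.zeros((d_out, r))                    # trainable, zero init -> no change at start
        self.scale = alpha / r

    def forward(self, x):
        # x: (batch, d_in) -> (batch, d_out)
        return x @ self.W0.T + self.scale * (x @ self.A.T) @ self.B.T

layer = LoRALinear(np.random.default_rng(1).standard_normal((32, 64)))
print(layer.forward(np.ones((4, 64))).shape)   # (4, 32)
```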

Adaptive Backtracking For Faster Optimization

JV Cavalcanti, L Lessard, AC Wilson - arXiv preprint arXiv:2408.13150, 2024 - arxiv.org
Backtracking line search is foundational in numerical optimization. The basic idea is to
adjust the step size of an algorithm by a constant factor until some chosen criterion (e.g. …
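The criterion left truncated above is typically the Armijo sufficient-decrease test. Below is a minimal sketch of that classical baseline, which shrinks the step by a constant factor until the test holds; the adaptive variant proposed in the paper is not reproduced here.

```python
# Classical backtracking line search with the Armijo sufficient-decrease test:
# shrink t by a constant factor beta until f(x - t*g) <= f(x) - c * t * ||g||^2.
import numpy as np

def backtracking_step(f, x, g, t0=1.0, beta=0.5, c=1e-4):
    t, fx, gg = t0, f(x), g @ g
    while f(x - t * g) > fx - c * t * gg:
        t *= beta
    return t

f = lambda x: 0.5 * x @ x + np.sin(x).sum()   # smooth non-convex toy objective
grad = lambda x: x + np.cos(x)
x = np.full(3, 2.0)
for _ in range(100):
    g = grad(x)
    x = x - backtracking_step(f, x, g) * g
print(x, f(x))
```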

Unconstrained optimisation on Riemannian manifolds

TT Truong - arXiv preprint arXiv:2008.11091, 2020 - arxiv.org
In this paper, we give explicit descriptions of versions of (Local-) Backtracking Gradient
Descent and New Q-Newton's method in the Riemannian setting. Here are some easy to …
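A toy instance of backtracking gradient descent in a Riemannian setting, assuming the unit sphere with projection onto the tangent space and retraction by renormalization; the paper's general Riemannian formulation and New Q-Newton's method are not reproduced here.

```python
# Sketch: backtracking gradient descent on the unit sphere S^{d-1}, applied to
# the Rayleigh quotient f(x) = x^T M x, whose minimizer on the sphere is the
# eigenvector of the smallest eigenvalue of M.
import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((10, 10)); M = M + M.T

f = lambda x: x @ M @ x
egrad = lambda x: 2.0 * M @ x

def retract(x, v):                   # retraction: move in tangent direction, renormalize
    y = x + v
    return y / np.linalg.norm(y)

x = rng.standard_normal(10); x /= np.linalg.norm(x)
for _ in range(300):
    g = egrad(x)
    rg = g - (x @ g) * x             # Riemannian gradient: project onto tangent space at x
    t, fx, c, beta = 1.0, f(x), 1e-4, 0.5
    while f(retract(x, -t * rg)) > fx - c * t * (rg @ rg):
        t *= beta
        if t < 1e-12:                # safeguard against endless backtracking near a minimizer
            break
    x = retract(x, -t * rg)

print("f(x):", f(x), "lambda_min:", np.linalg.eigvalsh(M)[0])
```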

Backtracking gradient descent allowing unbounded learning rates

TT Truong - arXiv preprint arXiv:2001.02005, 2020 - arxiv.org
In unconstrained optimisation on a Euclidean space, to prove convergence of Gradient
Descent processes (GD) $x_{n+1} = x_n - \delta_n \nabla f(x_n)$, it is usually required that …
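For context on the boundedness that is classically required: when $\nabla f$ is globally $L$-Lipschitz, the standard descent lemma gives $f(x_{n+1}) \le f(x_n) - \delta_n\left(1 - \tfrac{L\delta_n}{2}\right)\|\nabla f(x_n)\|^2$, so monotone decrease is guaranteed only for step sizes $0 < \delta_n < 2/L$. This is standard background on why learning rates are usually assumed bounded, not a statement of the weaker conditions developed in the paper.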

Inertial Newton algorithms avoiding strict saddle points

C Castera - Journal of Optimization Theory and Applications, 2023 - Springer
We study the asymptotic behavior of second-order algorithms mixing Newton's method and
inertial gradient descent in non-convex landscapes. We show that, despite the Newtonian …