Generalized Continuous-Time Models for Nesterov's Accelerated Gradient Methods
Recent research has indicated a substantial rise in interest in understanding Nesterov's
accelerated gradient methods via their continuous-time models. However, most existing …
Accelerated convex optimization with stochastic gradients: Generalizing the strong-growth condition
This paper presents a sufficient condition for stochastic gradients not to slow down the
convergence of Nesterov's accelerated gradient method. The new condition has the strong …
Resonance in Weight Space: Covariate Shift Can Drive Divergence of SGD with Momentum
Most convergence guarantees for stochastic gradient descent with momentum (SGDm) rely
on iid sampling. Yet, SGDm is often used outside this regime, in settings with temporally …
Strange springs in many dimensions: how parametric resonance can explain divergence under covariate shift.
K Banman - 2021 - era.library.ualberta.ca
Most convergence guarantees for stochastic gradient descent with momentum (SGDm) rely
on independently and identically distributed (iid) data sampling. Yet, SGDm is often used …