Stochastic gradient descent with preconditioned Polyak step-size
Stochastic Gradient Descent (SGD) is one of the many iterative optimization
methods that are widely used in solving machine learning problems. These methods display …
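For context, the stochastic Polyak step-size named in the title is a concrete rule that fits in a few lines. The sketch below assumes access to per-sample losses and gradients and uses an optional cap `gamma_max`; all names are illustrative, not taken from the paper.

```python
import numpy as np

def sgd_polyak(loss_i, grad_i, x0, n_samples, n_iters=1000, f_star=0.0, gamma_max=1.0, seed=0):
    """SGD with the stochastic Polyak step-size
    gamma_t = (f_i(x_t) - f_i*) / ||grad f_i(x_t)||^2, capped at gamma_max."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(n_iters):
        i = rng.integers(n_samples)       # pick one sample
        g = grad_i(x, i)                  # stochastic gradient of that sample's loss
        gamma = min(gamma_max, (loss_i(x, i) - f_star) / (np.dot(g, g) + 1e-12))
        x -= gamma * g
    return x
```

A preconditioned variant along the lines of the title would presumably measure the gradient in a preconditioner-induced norm and rescale the step accordingly; that is omitted here.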
Local methods with adaptivity via scaling
S Chezhegov, S Skorik, N Khachaturov… - arXiv preprint arXiv …, 2024 - arxiv.org
The rapid development of machine learning and deep learning has introduced increasingly
complex optimization challenges that must be addressed. Indeed, training modern …
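As a rough illustration of the kind of method the title refers to (local steps on each client combined with a diagonal scaling), here is a hedged sketch; the Adagrad-style scaling, the averaging step, and all names are assumptions for illustration, not the algorithm analyzed in the paper.

```python
import numpy as np

def local_sgd_with_scaling(grads, x0, n_clients, local_steps=10, rounds=50, lr=0.1, eps=1e-8):
    """Local SGD where each client rescales its updates with an
    Adagrad-style diagonal preconditioner, then the server averages."""
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(rounds):
        client_models = []
        for c in range(n_clients):
            xc = x.copy()
            accum = np.zeros_like(x)           # per-client squared-gradient accumulator
            for _ in range(local_steps):
                g = grads(xc, c)               # stochastic gradient on client c's data
                accum += g * g
                xc -= lr * g / (np.sqrt(accum) + eps)   # scaled local step
            client_models.append(xc)
        x = np.mean(client_models, axis=0)     # server aggregation
    return x
```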
Preconditioning meets biased compression for efficient distributed optimization
Methods with preconditioned updates perform well on badly scaled and/or ill-conditioned
convex optimization problems. However, theoretical analysis of these methods in distributed …
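To make the combination concrete, below is a minimal sketch of distributed SGD with a biased Top-K compressor, error feedback, and an RMSProp-style diagonal preconditioner on the server; this specific pairing and all names are illustrative assumptions, not the paper's method.

```python
import numpy as np

def top_k(v, k):
    """Biased Top-K compressor: keep the k largest-magnitude coordinates."""
    out = np.zeros_like(v)
    idx = np.argpartition(np.abs(v), -k)[-k:]
    out[idx] = v[idx]
    return out

def distributed_precond_ef_sgd(grads, x0, n_workers, k, n_iters=100, lr=0.1, eps=1e-8):
    """Workers send Top-K-compressed gradients with error feedback;
    the server scales the aggregate with a diagonal preconditioner."""
    x = np.asarray(x0, dtype=float).copy()
    err = [np.zeros_like(x) for _ in range(n_workers)]   # per-worker error memory
    second_moment = np.zeros_like(x)
    for _ in range(n_iters):
        msgs = []
        for w in range(n_workers):
            g = grads(x, w) + err[w]       # add back what was lost last round
            c = top_k(g, k)
            err[w] = g - c                 # store the compression error
            msgs.append(c)
        avg = np.mean(msgs, axis=0)
        second_moment = 0.99 * second_moment + 0.01 * avg * avg
        x -= lr * avg / (np.sqrt(second_moment) + eps)
    return x
```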
OPTAMI: Global Superlinear Convergence of High-order Methods
Second-order methods for convex optimization outperform first-order methods in terms of
theoretical iteration convergence, achieving rates up to $O(k^{-5})$ for highly-smooth …
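A representative building block of such high-order methods is the cubically regularized Newton step. The sketch below is a generic Nesterov–Polyak-style step, not OPTAMI itself; it assumes a convex objective (positive semidefinite Hessian) and finds the regularized step via bisection on the step radius.

```python
import numpy as np

def cubic_newton_step(grad, hess, x, M=1.0, tol=1e-8):
    """One cubically regularized Newton step: find r >= 0 with
    r = ||(H + (M r / 2) I)^{-1} g||, then move along that direction.
    Assumes H = hess(x) is positive semidefinite (convex objective)."""
    g, H = grad(x), hess(x)
    eye = np.eye(len(x))
    def step_for(r):
        return np.linalg.solve(H + 0.5 * M * r * eye, g)
    lo, hi = 0.0, 1.0
    while np.linalg.norm(step_for(hi)) > hi:   # bracket the fixed point
        hi *= 2.0
    for _ in range(100):                       # bisection on the scalar radius r
        mid = 0.5 * (lo + hi)
        if np.linalg.norm(step_for(mid)) > mid:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return x - step_for(hi)
```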
Just a Simple Transformation is Enough for Data Protection in Vertical Federated Learning
Vertical Federated Learning (VFL) aims to enable collaborative training of deep learning
models while maintaining privacy protection. However, the VFL procedure still has …
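Purely as an illustration of what "a simple transformation" could mean in the VFL setting (not necessarily the transformation studied in the paper), a party could pass its local embeddings through a fixed random orthogonal map before sending them to the server:

```python
import numpy as np

def protected_embeddings(features, seed=0):
    """Apply a fixed random orthogonal transformation to a party's local
    embeddings before sharing them in VFL. Purely illustrative."""
    rng = np.random.default_rng(seed)
    d = features.shape[1]
    q, _ = np.linalg.qr(rng.standard_normal((d, d)))   # random orthogonal matrix
    return features @ q
```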
AI-SARAH: Adaptive and implicit stochastic recursive gradient methods
We present AI-SARAH, a practical variant of SARAH. As a variant of SARAH, this algorithm
employs the stochastic recursive gradient yet adjusts step-size based on local geometry. AI …
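For reference, the underlying SARAH recursive gradient estimator can be sketched as follows; the adaptive, implicit step-size that distinguishes AI-SARAH is omitted, and the callables are placeholders.

```python
import numpy as np

def sarah(grad_i, full_grad, x0, n_samples, inner_steps=100, outer_loops=10, lr=0.05, seed=0):
    """Vanilla SARAH: the recursive gradient estimator that AI-SARAH builds on."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(outer_loops):
        v = full_grad(x)                  # full gradient at the start of each outer loop
        x_prev = x.copy()
        x = x - lr * v
        for _ in range(inner_steps):
            i = rng.integers(n_samples)
            v = grad_i(x, i) - grad_i(x_prev, i) + v   # recursive gradient update
            x_prev = x.copy()
            x = x - lr * v
    return x
```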
Exploring Jacobian Inexactness in Second-Order Methods for Variational Inequalities: Lower Bounds, Optimal Algorithms and Quasi-Newton Approximations
Variational inequalities represent a broad class of problems, including minimization and min-
max problems, commonly found in machine learning. Existing second-order and high-order …
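As a minimal second-order baseline for the unconstrained case F(x) = 0, a Newton-type step on the operator looks like the sketch below; the Jacobian inexactness and quasi-Newton approximations studied in the paper are not modeled, and the names are illustrative.

```python
import numpy as np

def newton_vi_step(F, J, x, damping=1.0):
    """One Newton-type step x+ = x - damping * J(x)^{-1} F(x) for the operator
    equation F(x) = 0 (the unconstrained variational inequality)."""
    return x - damping * np.linalg.solve(J(x), F(x))

# Toy usage: the bilinear saddle point min_x max_y x*y gives F(z) = (y, -x).
F = lambda z: np.array([z[1], -z[0]])
J = lambda z: np.array([[0.0, 1.0], [-1.0, 0.0]])
z_next = newton_vi_step(F, J, np.array([1.0, 2.0]))   # lands on the solution (0, 0)
```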
Effects of momentum scaling for SGD
The paper studies the properties of stochastic gradient methods with preconditioning. We
focus on momentum-updated preconditioners with momentum coefficient $\beta$. Seeking …
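A minimal sketch of a momentum-updated diagonal preconditioner with coefficient $\beta$ (RMSProp-style exponential averaging of squared gradients) is given below; this is a generic illustration under assumed names, not necessarily the exact scaling analyzed in the paper.

```python
import numpy as np

def sgd_momentum_preconditioner(grad, x0, n_iters=1000, lr=0.01, beta=0.99, eps=1e-8, seed=0):
    """SGD with a diagonal preconditioner updated by an exponential moving
    average with momentum coefficient beta."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    d = np.zeros_like(x)                      # diagonal preconditioner state
    for _ in range(n_iters):
        g = grad(x, rng)                      # stochastic gradient (placeholder callable)
        d = beta * d + (1.0 - beta) * g * g   # momentum update of the preconditioner
        x -= lr * g / (np.sqrt(d) + eps)
    return x
```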
Founders: Federal Research Center "Informatics and Control" of the Russian Academy of Sciences, Russian Academy of Sciences
F Abdukhakimov, C Xiang, D Kamzolov… - JOURNAL …, 2024 - elibrary.ru
Stochastic gradient descent (SGD) is one of the many optimization methods
used for solving machine learning problems. The practicality and …