Stochastic gradient descent with preconditioned Polyak step-size

F Abdukhakimov, C Xiang, D Kamzolov… - … and Mathematical Physics, 2024 - Springer
Stochastic Gradient Descent (SGD) is one of the many iterative optimization
methods that are widely used in solving machine learning problems. These methods display …
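For context on the title's "preconditioned Polyak step-size", the sketch below shows one plausible reading: an SGD update whose step-size follows the classical Polyak rule (loss gap divided by squared gradient norm), measured in the metric of a diagonal preconditioner. All names, the cap gamma_max, and the choice f_star = 0 are illustrative assumptions, not the paper's exact formula.

```python
import numpy as np

def preconditioned_polyak_step(x, grad, loss, precond_diag, f_star=0.0, gamma_max=1.0):
    """One SGD step with a Polyak-type step-size and a diagonal preconditioner.

    Hypothetical sketch: the step-size is (f_i(x) - f_i^*) divided by the squared
    gradient norm in the D^{-1} metric, capped at gamma_max; the exact rule and
    preconditioner used in the cited paper may differ.
    """
    d_inv = 1.0 / precond_diag                    # inverse of the diagonal preconditioner D
    sq_norm = float(np.dot(grad, d_inv * grad))   # ||grad||^2 measured in the D^{-1} metric
    gamma = min((loss - f_star) / (sq_norm + 1e-12), gamma_max)
    return x - gamma * d_inv * grad               # preconditioned Polyak step
```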

Local methods with adaptivity via scaling

S Chezhegov, S Skorik, N Khachaturov… - arXiv preprint arXiv …, 2024 - arxiv.org
The rapid development of machine learning and deep learning has introduced increasingly
complex optimization challenges that must be addressed. Indeed, training modern …

Preconditioning meets biased compression for efficient distributed optimization

V Pirau, A Beznosikov, M Takáč, V Matyukhin… - Computational …, 2024 - Springer
Methods with preconditioned updates perform well on badly scaled and/or ill-conditioned
convex optimization problems. However, theoretical analysis of these methods in distributed …
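The abstract refers to biased compression operators; a standard example (shown here only to illustrate the class of operators such analyses cover, not as the paper's specific scheme) is Top-K sparsification:

```python
import numpy as np

def top_k(vector: np.ndarray, k: int) -> np.ndarray:
    """Top-K sparsification: keep the k largest-magnitude coordinates, zero out the rest.

    A classical *biased* compressor, often paired with error feedback in distributed
    training; the compressors studied in the cited paper may differ.
    """
    out = np.zeros_like(vector)
    idx = np.argpartition(np.abs(vector), -k)[-k:]  # indices of the k largest entries
    out[idx] = vector[idx]
    return out
```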

OPTAMI: Global Superlinear Convergence of High-order Methods

D Kamzolov, D Pasechnyuk, A Agafonov… - arXiv preprint arXiv …, 2024 - arxiv.org
Second-order methods for convex optimization outperform first-order methods in terms of
theoretical iteration convergence, achieving rates up to $O(k^{-5})$ for highly smooth …
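As background on the quoted $O(k^{-5})$ rate (a standard result from the tensor-methods literature, stated here for context rather than taken from the paper): accelerated $p$-th order methods for convex objectives with Lipschitz $p$-th derivative achieve

```latex
\[
  f(x_k) - f^{*} = O\!\left(k^{-\frac{3p+1}{2}}\right),
  \qquad p = 3 \;\Rightarrow\; f(x_k) - f^{*} = O\!\left(k^{-5}\right).
\]
```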

Just a Simple Transformation is Enough for Data Protection in Vertical Federated Learning

A Semenov, P Zmushko, A Pichugin… - arXiv preprint arXiv …, 2024 - arxiv.org
Vertical Federated Learning (VFL) aims to enable collaborative training of deep learning
models while maintaining privacy protection. However, the VFL procedure still has …

Ai-sarah: Adaptive and implicit stochastic recursive gradient methods

Z Shi, A Sadiev, N Loizou, P Richtárik… - arXiv preprint arXiv …, 2021 - arxiv.org
We present AI-SARAH, a practical variant of SARAH. Like SARAH, the algorithm employs
the stochastic recursive gradient, yet it adjusts the step-size based on local geometry. AI …
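Since the snippet mentions the stochastic recursive gradient, here is a minimal sketch of a plain SARAH outer loop with a fixed step-size; the adaptive, implicit step-size rule that defines AI-SARAH is not reproduced. Function names and arguments are illustrative.

```python
import numpy as np

def sarah_epoch(x, grad_full, grad_i, gamma, num_inner, num_samples, rng):
    """One SARAH outer loop: anchor with a full gradient, then recursive stochastic updates.

    grad_full(x) returns the full gradient; grad_i(x, i) returns the gradient of sample i.
    AI-SARAH replaces the fixed step-size gamma with one adapted to local geometry;
    that rule is the paper's contribution and is not shown here.
    """
    v = grad_full(x)                               # exact gradient at the epoch start
    x_prev, x = x.copy(), x - gamma * v
    for _ in range(num_inner):
        i = rng.integers(num_samples)
        v = grad_i(x, i) - grad_i(x_prev, i) + v   # recursive gradient estimator
        x_prev, x = x, x - gamma * v
    return x
```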

Exploring Jacobian Inexactness in Second-Order Methods for Variational Inequalities: Lower Bounds, Optimal Algorithms and Quasi-Newton Approximations

A Agafonov, P Ostroukhov, R Mozhaev… - arXiv preprint arXiv …, 2024 - arxiv.org
Variational inequalities represent a broad class of problems, including minimization and min-
max problems, commonly found in machine learning. Existing second-order and high-order …
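For readers less familiar with the setting, the standard (Stampacchia) formulation behind "variational inequalities" is the following; minimization and min-max problems are recovered by the choices of $F$ noted below.

```latex
% Find x^* in a convex set X such that
\[
  \langle F(x^{*}),\, x - x^{*} \rangle \ge 0 \qquad \text{for all } x \in X .
\]
% Minimization of f corresponds to F = \nabla f; a convex-concave min--max problem
% \min_{u} \max_{v} \phi(u, v) corresponds to F(u, v) = (\nabla_u \phi(u, v),\, -\nabla_v \phi(u, v)).
```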

Effects of momentum scaling for SGD

DA Pasechnyuk, A Gasnikov, M Takáč - arXiv preprint arXiv:2210.11869, 2022 - arxiv.org
The paper studies the properties of stochastic gradient methods with preconditioning. We
focus on momentum-updated preconditioners with momentum coefficient $\beta$. Seeking …
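A minimal sketch of the momentum-updated diagonal preconditioner idea named in the snippet, assuming a squared-gradient running average as the new diagonal estimate (an Adam/RMSProp-style choice made here for concreteness; the paper treats the momentum update of the preconditioner more generally):

```python
import numpy as np

def momentum_preconditioned_step(x, grad, diag_state, beta=0.99, lr=0.1, eps=1e-8):
    """SGD step with a diagonal preconditioner updated by momentum with coefficient beta.

    diag_state holds the running diagonal D_{k-1}; it is refreshed as
    D_k = beta * D_{k-1} + (1 - beta) * (new diagonal estimate), here grad**2.
    """
    diag_state = beta * diag_state + (1.0 - beta) * grad**2    # momentum-averaged diagonal
    x = x - lr * grad / (np.sqrt(diag_state) + eps)            # preconditioned gradient step
    return x, diag_state
```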

Stochastic gradient descent with preconditioned Polyak step-size (Russian-language version of the first entry above)

F Abdukhakimov, C Xiang, D Kamzolov… - ЖУРНАЛ …, 2024 - elibrary.ru
Stochastic Gradient Descent (SGD) is one of the many optimization methods used for
solving machine learning problems. The practicality and …