A guide through the zoo of biased SGD
Y Demidovich, G Malinovsky… - Advances in Neural …, 2023 - proceedings.neurips.cc
Abstract: Stochastic Gradient Descent (SGD) is arguably the most important single algorithm
in modern machine learning. Although SGD with unbiased gradient estimators has been …
SEGA: Variance reduction via gradient sketching
F Hanzely, K Mishchenko… - Advances in Neural …, 2018 - proceedings.neurips.cc
We propose a novel randomized first order optimization method---SEGA (SkEtched GrAdient
method)---which progressively throughout its iterations builds a variance-reduced estimate …
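The coordinate-sketch special case of this idea can be illustrated in a few lines: keep a running estimate h of the full gradient, refresh one randomly chosen coordinate per step, and correct the bias so the resulting step is unbiased. The sketch below is illustrative only, not the authors' implementation; the function name, step size, and iteration count are assumptions.

```python
import numpy as np

def sega_coordinate(grad, x0, lr=0.02, iters=3000, rng=None):
    """SEGA-style coordinate sketching (illustrative sketch).

    Maintains a running estimate h of the gradient, refreshed one random
    coordinate per step, and takes a step along an unbiased estimate g.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float).copy()
    n = x.size
    h = np.zeros(n)
    for _ in range(iters):
        i = rng.integers(n)
        gi = grad(x)[i]            # one coordinate of the true gradient
        g = h.copy()
        g[i] += n * (gi - h[i])    # debiased estimate: E[g] = grad(x)
        h[i] = gi                  # sketch-and-project update of h
        x -= lr * g
    return x

# Usage: minimize f(x) = ||x||^2, whose gradient is 2x.
x = sega_coordinate(lambda z: 2 * z, np.ones(5), rng=np.random.default_rng(0))
```

Note that only one gradient coordinate is consulted per iteration, yet the running estimate h makes the method variance-reduced as h converges to the true gradient.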
Prompt-tuning decision transformer with preference ranking
Prompt-tuning has emerged as a promising method for adapting pre-trained models to
downstream tasks or aligning with human preferences. Prompt learning is widely used in …
Scalable subspace methods for derivative-free nonlinear least-squares optimization
We introduce a general framework for large-scale model-based derivative-free optimization
based on iterative minimization within random subspaces. We present a probabilistic worst …
Direct search based on probabilistic descent in reduced spaces
Derivative-free algorithms seek the minimum value of a given objective function without
using any derivative information. The performance of these methods often worsens as the …
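A minimal direct-search loop of the kind this line of work studies polls the objective along a few random directions and their negatives, accepting any point with sufficient decrease and otherwise shrinking the step size. This is a generic sketch of the technique, not the paper's algorithm; all names and constants below are illustrative assumptions.

```python
import numpy as np

def direct_search(f, x0, alpha=1.0, iters=300, m=2, rng=None):
    """Direct search with random polling directions (illustrative sketch).

    Each iteration polls f along m random unit directions and their
    negatives; a trial point is accepted on sufficient decrease,
    otherwise the step size alpha is halved.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float).copy()
    fx = f(x)
    for _ in range(iters):
        moved = False
        for _ in range(m):
            d = rng.standard_normal(x.size)
            d /= np.linalg.norm(d)
            for s in (d, -d):
                trial = x + alpha * s
                ft = f(trial)
                if ft < fx - 1e-8 * alpha ** 2:  # sufficient decrease
                    x, fx, moved = trial, ft, True
                    break
            if moved:
                break
        alpha = alpha * 2.0 if moved else alpha * 0.5
    return x, fx

# Usage: minimize a quadratic using only function evaluations.
x_best, f_best = direct_search(lambda z: float(np.sum(z * z)),
                               np.ones(5), rng=np.random.default_rng(0))
```

Polling opposite directions in pairs guarantees that at least one poll direction makes an acute angle with the negative gradient whenever the direction itself does not, which is what makes random polling viable without derivatives.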
Combating adversaries with anti-adversaries
Deep neural networks are vulnerable to small input perturbations known as adversarial
attacks. Inspired by the fact that these adversaries are constructed by iteratively minimizing …
Zeroth-order algorithms for stochastic distributed nonconvex optimization
In this paper, we consider a stochastic distributed nonconvex optimization problem with the
cost function being distributed over n agents having access only to zeroth-order (ZO) …
Zeroth-order regularized optimization (zoro): Approximately sparse gradients and adaptive sampling
We consider the problem of minimizing a high-dimensional objective function, which may
include a regularization term, using only (possibly noisy) evaluations of the function. Such …
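The zeroth-order setting shared by the two entries above, optimizing with only (possibly noisy) function evaluations, is commonly built on a two-point finite-difference gradient estimate averaged over random directions. The sketch below shows that basic estimator, not either paper's method; the function name, smoothing radius, and sample count are assumptions.

```python
import numpy as np

def zo_gradient(f, x, mu=1e-4, num_dirs=20, rng=None):
    """Two-point zeroth-order gradient estimate of f at x.

    Averages directional finite differences along random Gaussian
    directions; only function evaluations are used, no derivatives.
    """
    rng = np.random.default_rng() if rng is None else rng
    g = np.zeros_like(x, dtype=float)
    for _ in range(num_dirs):
        u = rng.standard_normal(x.shape)
        g += (f(x + mu * u) - f(x - mu * u)) / (2 * mu) * u
    return g / num_dirs

# Usage: descend on a simple quadratic using only function values.
rng = np.random.default_rng(0)
f = lambda x: float(np.sum(x ** 2))
x = np.ones(5)
for _ in range(200):
    x -= 0.05 * zo_gradient(f, x, rng=rng)
```

Since E[u uᵀ] = I for Gaussian directions, the estimate is unbiased for the gradient of a smoothed version of f, which is why plain SGD-style updates on it still converge.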
Monte Carlo tree descent for black-box optimization
Abstract: The key to Black-Box Optimization is to efficiently search through input regions with
potentially widely-varying numerical properties, to achieve low-regret descent and fast …
An Evolutionary field theorem: Evolutionary field optimization in training of power-weighted multiplicative neurons for nitrogen oxides-sensitive electronic nose …
Neuroevolutionary machine learning is an emerging topic in the evolutionary computation
field and enables practical modeling solutions for data-driven engineering applications …