A guide through the zoo of biased SGD

Y Demidovich, G Malinovsky… - Advances in Neural …, 2023 - proceedings.neurips.cc
Stochastic Gradient Descent (SGD) is arguably the most important single algorithm
in modern machine learning. Although SGD with unbiased gradient estimators has been …
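
The snippet is cut off, but top-k gradient compression is a standard example of the biased estimators such surveys analyze. A minimal sketch of SGD driven by that estimator (the function names and the toy quadratic are our own illustration, not taken from the paper):

```python
import numpy as np

def top_k(g, k):
    # Keep the k largest-magnitude entries of g, zero the rest:
    # a standard example of a *biased* gradient compressor.
    out = np.zeros_like(g)
    idx = np.argpartition(np.abs(g), -k)[-k:]
    out[idx] = g[idx]
    return out

def biased_sgd(grad, x0, lr=0.1, k=2, iters=300):
    # Plain SGD loop, but driven by the biased (compressed) estimator.
    x = x0.copy()
    for _ in range(iters):
        x -= lr * top_k(grad(x), k)
    return x

# Toy quadratic f(x) = 0.5 * ||x||^2, whose gradient is x.
x = biased_sgd(lambda x: x, np.ones(10))
print(np.linalg.norm(x))  # close to 0 despite the bias
```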

SEGA: Variance reduction via gradient sketching

F Hanzely, K Mishchenko… - Advances in Neural …, 2018 - proceedings.neurips.cc
We propose a novel randomized first-order optimization method---SEGA (SkEtched GrAdient
method)---which progressively throughout its iterations builds a variance-reduced estimate …
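
SEGA specialized to coordinate sketches is easy to state: only one partial derivative is observed per step, yet an unbiased gradient estimate is assembled from a running memory vector. A minimal sketch (the step size and the test problem are illustrative assumptions):

```python
import numpy as np

def sega(grad_coord, x0, lr=0.05, iters=2000, seed=0):
    # SEGA with coordinate sketches: each step sees one partial
    # derivative, yet forms an unbiased gradient estimate from
    # the running memory vector h.
    rng = np.random.default_rng(seed)
    x, n = x0.copy(), x0.size
    h = np.zeros(n)                  # sketched gradient memory
    for _ in range(iters):
        i = rng.integers(n)          # random coordinate sketch
        gi = grad_coord(x, i)        # the only gradient info used
        g = h.copy()
        g[i] += n * (gi - h[i])      # unbiased estimate of grad f(x)
        h[i] = gi                    # update the memory
        x -= lr * g
    return x

# Toy quadratic: f(x) = 0.5 * sum(a_j * x_j^2), so df/dx_i = a_i * x_i.
a = np.linspace(1.0, 2.0, 5)
x = sega(lambda x, i: a[i] * x[i], np.ones(5))
print(np.round(x, 3))  # approaches the minimizer at 0
```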

Prompt-tuning decision transformer with preference ranking

S Hu, L Shen, Y Zhang, D Tao - arXiv preprint arXiv:2305.09648, 2023 - arxiv.org
Prompt-tuning has emerged as a promising method for adapting pre-trained models to
downstream tasks or aligning with human preferences. Prompt learning is widely used in …
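
The mechanism behind soft prompt-tuning fits in a toy example: the pre-trained weights stay frozen and only a small prompt vector, concatenated to every input, is trained. Below, a linear map stands in for the frozen model; all names and dimensions are illustrative assumptions:

```python
import numpy as np

def prompt_tune(W_frozen, X, y, p_dim, lr=0.05, steps=500, seed=0):
    # Soft prompt-tuning in miniature: W_frozen never changes; only
    # the prompt vector p, concatenated to every input, is optimized.
    rng = np.random.default_rng(seed)
    p = rng.standard_normal(p_dim) * 0.01
    Wp, Wx = W_frozen[:p_dim], W_frozen[p_dim:]   # split the frozen weights
    for _ in range(steps):
        pred = X @ Wx + p @ Wp                    # model([prompt; x])
        grad_p = 2 * Wp * np.mean(pred - y)       # d MSE / d p (chain rule)
        p -= lr * grad_p
    return p

rng = np.random.default_rng(0)
W = rng.standard_normal(6)          # frozen weights: 4 prompt dims + 2 input dims
X = rng.standard_normal((50, 2))
y = X @ W[4:] + 1.5                 # task needs a shift the prompt can supply
p = prompt_tune(W, X, y, p_dim=4)
print(round(p @ W[:4], 3))          # ~1.5: the prompt learned the shift
```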

Scalable subspace methods for derivative-free nonlinear least-squares optimization

C Cartis, L Roberts - Mathematical Programming, 2023 - Springer
We introduce a general framework for large-scale model-based derivative-free optimization
based on iterative minimization within random subspaces. We present a probabilistic worst …
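
The core loop, reduced to its simplest form: estimate the residual Jacobian by finite differences along a handful of random directions and take a regularized Gauss-Newton step in that subspace. This is a schematic of the idea, not the authors' algorithm (their method manages interpolation models and trust regions); the subspace dimension, regularization, and test problem are assumptions:

```python
import numpy as np

def subspace_dfls(res, x0, p=2, delta=1e-6, lam=1e-3, iters=200, seed=0):
    # Derivative-free Gauss-Newton in random subspaces: each iteration
    # samples the residual Jacobian along only p << n directions.
    rng = np.random.default_rng(seed)
    x, n = x0.copy(), x0.size
    for _ in range(iters):
        Q, _ = np.linalg.qr(rng.standard_normal((n, p)))  # random subspace
        r0 = res(x)
        # Finite-difference Jacobian of the residuals along the subspace.
        Js = np.stack([(res(x + delta * Q[:, j]) - r0) / delta
                       for j in range(p)], axis=1)
        s = np.linalg.solve(Js.T @ Js + lam * np.eye(p), -Js.T @ r0)
        x = x + Q @ s
    return x

# Toy nonlinear least squares: residuals r_j(x) = x_j**2 - targets_j.
targets = np.array([1.0, 4.0, 9.0, 16.0])
x = subspace_dfls(lambda x: x**2 - targets, np.full(4, 2.0))
print(np.round(x**2, 2))  # approaches the targets
```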

Direct search based on probabilistic descent in reduced spaces

L Roberts, CW Royer - SIAM Journal on Optimization, 2023 - SIAM
Derivative-free algorithms seek the minimum value of a given objective function without
using any derivative information. The performance of these methods often worsens as the …
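
In code, the reduced-space idea amounts to polling a few directions drawn through a low-dimensional random projection and testing them against a sufficient-decrease condition. A minimal sketch (the polling constants and step-size update rule are illustrative choices, not the paper's):

```python
import numpy as np

def ds_reduced(f, x0, alpha=1.0, p=2, iters=500, seed=0):
    # Direct search with probabilistic descent: poll random directions
    # from a p-dimensional subspace; expand/shrink the step size.
    rng = np.random.default_rng(seed)
    x, fx, n = x0.copy(), f(x0), x0.size
    for _ in range(iters):
        P = rng.standard_normal((n, p))       # random reduced subspace
        dirs = [P @ rng.standard_normal(p) for _ in range(2)]
        dirs = [d / np.linalg.norm(d) for d in dirs]
        improved = False
        for d in dirs:
            for s in (1.0, -1.0):             # poll opposite directions too
                y = x + s * alpha * d
                fy = f(y)
                if fy < fx - 1e-4 * alpha**2: # sufficient decrease test
                    x, fx, improved = y, fy, True
                    break
            if improved:
                break
        alpha = 2 * alpha if improved else alpha / 2
    return x, fx

x, fx = ds_reduced(lambda z: np.sum((z - 1.0) ** 2), np.zeros(10))
print(round(fx, 6))  # near 0: the minimizer is the all-ones vector
```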

Combating adversaries with anti-adversaries

M Alfarra, JC Pérez, A Thabet, A Bibi… - Proceedings of the …, 2022 - ojs.aaai.org
Deep neural networks are vulnerable to small input perturbations known as adversarial
attacks. Inspired by the fact that these adversaries are constructed by iteratively minimizing …
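
The paper's inversion of this observation is an "anti-adversary" layer that perturbs the input toward the model's own prediction before classifying. A minimal sketch for a linear softmax classifier (the FGSM-style signed steps and all constants are our illustrative stand-ins; the paper works with deep networks):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def anti_adversary(x, W, b, eps=0.5, steps=5):
    # Push the input a few signed-gradient steps *toward* the model's
    # own predicted class: the reverse of an FGSM attack.
    k = int(np.argmax(W @ x + b))        # the model's initial prediction
    x_aa = x.copy()
    for _ in range(steps):
        p = softmax(W @ x_aa + b)
        grad = W[k] - p @ W              # d log p_k / d x for a linear model
        x_aa = x_aa + (eps / steps) * np.sign(grad)
    return x_aa

rng = np.random.default_rng(0)
W, b = rng.standard_normal((3, 5)), np.zeros(3)
x = rng.standard_normal(5)
x_aa = anti_adversary(x, W, b)
# Confidence in the predicted class rises after the anti-adversary step.
print(softmax(W @ x + b).max(), softmax(W @ x_aa + b).max())
```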

Zeroth-order algorithms for stochastic distributed nonconvex optimization

X Yi, S Zhang, T Yang, KH Johansson - Automatica, 2022 - Elsevier
In this paper, we consider a stochastic distributed nonconvex optimization problem with the
cost function being distributed over n agents having access only to zeroth-order (ZO) …
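
The building block here is the classical two-point zeroth-order gradient estimator, combined with gossip averaging across agents. A minimal sketch (the mixing matrix, step sizes, and toy local costs are illustrative assumptions):

```python
import numpy as np

def zo_grad(f, x, mu=1e-4, rng=None):
    # Two-point zeroth-order estimator: gradient from function values only.
    u = rng.standard_normal(x.size)
    return (f(x + mu * u) - f(x - mu * u)) / (2 * mu) * u

def distributed_zo(fs, x0, W, lr=0.02, iters=3000, seed=0):
    # Each agent i holds a local cost fs[i] and a local iterate (row of X);
    # one round = gossip averaging with mixing matrix W + a local ZO step.
    rng = np.random.default_rng(seed)
    X = np.tile(x0, (len(fs), 1))
    for _ in range(iters):
        X = W @ X                        # consensus (gossip) step
        for i, f in enumerate(fs):
            X[i] -= lr * zo_grad(f, X[i], rng=rng)
    return X.mean(axis=0)

# Three agents; the global cost sum_i ||x - c_i||^2 is minimized at mean(c_i).
c = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 3.0]])
fs = [lambda x, ci=ci: np.sum((x - ci) ** 2) for ci in c]
W = np.full((3, 3), 1 / 3)               # complete-graph averaging
print(np.round(distributed_zo(fs, np.zeros(2), W), 2))  # ~[1, 1] up to noise
```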

Zeroth-order regularized optimization (zoro): Approximately sparse gradients and adaptive sampling

HQ Cai, D McKenzie, W Yin, Z Zhang - SIAM Journal on Optimization, 2022 - SIAM
We consider the problem of minimizing a high-dimensional objective function, which may
include a regularization term, using only (possibly noisy) evaluations of the function. Such …
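
ZORO's premise is that when the gradient is approximately sparse, far fewer than n function evaluations suffice to estimate it via compressed sensing. A minimal sketch (the paper uses CoSaMP for recovery; plain orthogonal matching pursuit keeps this short, and the sparse test function is our own):

```python
import numpy as np

def zoro_grad(f, x, s, m, delta=1e-5, seed=0):
    # m << n finite-difference measurements along random sign vectors,
    # then sparse recovery of the gradient by matching pursuit.
    rng = np.random.default_rng(seed)
    n = x.size
    Z = rng.choice([-1.0, 1.0], size=(m, n)) / np.sqrt(m)
    fx = f(x)
    y = np.array([(f(x + delta * z) - fx) / delta for z in Z])  # y ~ Z @ grad
    support, r = [], y.copy()
    for _ in range(s):                       # orthogonal matching pursuit
        support.append(int(np.argmax(np.abs(Z.T @ r))))
        coef, *_ = np.linalg.lstsq(Z[:, support], y, rcond=None)
        r = y - Z[:, support] @ coef
    g = np.zeros(n)
    g[support] = coef
    return g

# Objective that depends on only 3 of 100 coordinates.
f = lambda x: 2 * x[3] ** 2 + 3 * x[50] ** 2 + 4 * x[97] ** 2
g = zoro_grad(f, np.ones(100), s=3, m=50)
print(np.sort(np.nonzero(g)[0]))  # recovers the support [3, 50, 97]
```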

Monte Carlo tree descent for black-box optimization

Y Zhai, S Gao - Advances in Neural Information Processing …, 2022 - proceedings.neurips.cc
The key to Black-Box Optimization is to efficiently search through input regions with
potentially widely-varying numerical properties, to achieve low-regret descent and fast …
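
A flat, bandit-style reduction of the idea fits in a few lines: partition the domain, score each region with a UCB rule built from its incumbent value, and sample inside the winner. The paper builds a genuine tree and interleaves local descent; everything below is an illustrative simplification:

```python
import numpy as np

def mc_descent(f, bounds, n_regions=8, budget=400, c=0.5, seed=0):
    # One-level "tree": regions scored by a UCB rule that trades off the
    # best value seen in a region against how rarely it has been visited.
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    edges = np.linspace(lo, hi, n_regions + 1)
    best = np.full(n_regions, np.inf)     # best value seen per region
    count = np.zeros(n_regions)
    x_best, f_best = None, np.inf
    for t in range(1, budget + 1):
        ucb = -best + c * np.sqrt(np.log(t) / np.maximum(count, 1))
        ucb[count == 0] = np.inf          # try every region once
        i = int(np.argmax(ucb))
        x = rng.uniform(edges[i], edges[i + 1])
        fx = f(x)
        count[i] += 1
        best[i] = min(best[i], fx)
        if fx < f_best:
            x_best, f_best = x, fx
    return x_best, f_best

# Multimodal 1-D test function; global minimum is about -0.97.
x, fx = mc_descent(lambda x: np.sin(3 * x) + 0.1 * x ** 2, (-5.0, 5.0))
print(round(x, 3), round(fx, 3))
```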

An Evolutionary field theorem: Evolutionary field optimization in training of power-weighted multiplicative neurons for nitrogen oxides-sensitive electronic nose …

BB Alagoz, OI Simsek, D Ari, A Tepljakov, E Petlenkov… - Sensors, 2022 - mdpi.com
Neuroevolutionary machine learning is an emerging topic in the evolutionary computation
field and enables practical modeling solutions for data-driven engineering applications …
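
A bare-bones illustration of the neuroevolutionary setup: a power-weighted multiplicative neuron (one plausible form, y = b * prod_i x_i^w_i, assumed here rather than taken from the paper) trained by a simple evolution strategy instead of gradient descent:

```python
import numpy as np

def mult_neuron(params, X):
    # Power-weighted multiplicative neuron (assumed form):
    # y = b * prod_i x_i ** w_i, for strictly positive inputs.
    b, w = params[0], params[1:]
    return b * np.prod(X ** w, axis=1)

def evolve(loss, dim, pop=40, sigma=0.3, gens=200, seed=0):
    # Bare-bones (1+lambda) evolution strategy: mutate the incumbent with
    # Gaussian noise, keep the best offspring if it improves the loss.
    rng = np.random.default_rng(seed)
    p = rng.standard_normal(dim) * 0.1
    best = loss(p)
    for _ in range(gens):
        cand = p + sigma * rng.standard_normal((pop, dim))
        vals = np.array([loss(c) for c in cand])
        if vals.min() < best:
            best, p = vals.min(), cand[np.argmin(vals)]
    return p, best

# Toy target: y = 2 * x1^1.5 * x2^-0.5 on positive inputs.
rng = np.random.default_rng(1)
X = rng.uniform(0.5, 2.0, size=(100, 2))
y = 2.0 * X[:, 0] ** 1.5 * X[:, 1] ** -0.5
params, mse = evolve(lambda p: np.mean((mult_neuron(p, X) - y) ** 2), dim=3)
print(np.round(params, 2), round(mse, 5))  # roughly [2, 1.5, -0.5]
```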