Random features for kernel approximation: A survey on algorithms, theory, and beyond
The class of random features is one of the most popular techniques to speed up kernel
methods in large-scale problems. Related works have been recognized by the NeurIPS Test …
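Several entries below build on this construction, so a minimal sketch of random Fourier features for the Gaussian kernel may be useful. This is an illustration of the general technique, not code from the survey; the feature dimension D, bandwidth sigma, and the helper name are arbitrary choices.

```python
import numpy as np

def random_fourier_features(X, D=500, sigma=1.0, seed=None):
    """Map X (n, d) to Z (n, D) so that Z @ Z.T approximates the Gaussian kernel
    k(x, y) = exp(-||x - y||^2 / (2 sigma^2))."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = rng.normal(scale=1.0 / sigma, size=(d, D))   # frequencies ~ N(0, sigma^{-2} I)
    b = rng.uniform(0.0, 2.0 * np.pi, size=D)        # random phases
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

# Sanity check on two points: the inner product should be close to the exact kernel.
rng = np.random.default_rng(0)
X = rng.normal(size=(2, 5))
Z = random_fourier_features(X, D=20000, sigma=1.0, seed=1)
print(Z[0] @ Z[1], np.exp(-np.sum((X[0] - X[1]) ** 2) / 2.0))
```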
Generalization properties of learning with random features
We study the generalization properties of ridge regression with random features in the
statistical learning framework. We show for the first time that $O(1/\sqrt{n})$ learning …
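To make the estimator concrete, here is one way ridge regression on top of a random feature map can be implemented; the problem sizes, the regularization lam, and the lam * n scaling are illustrative assumptions rather than the paper's exact setup.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, D, sigma, lam = 2000, 3, 300, 1.0, 1e-3

# A fixed random feature map, shared between training and prediction.
W = rng.normal(scale=1.0 / sigma, size=(d, D))
b = rng.uniform(0.0, 2.0 * np.pi, size=D)
phi = lambda X: np.sqrt(2.0 / D) * np.cos(X @ W + b)

X = rng.normal(size=(n, d))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=n)

# Ridge in feature space costs O(n D^2 + D^3) instead of O(n^3) for exact KRR.
Z = phi(X)
w = np.linalg.solve(Z.T @ Z + lam * n * np.eye(D), Z.T @ y)
print(phi(rng.normal(size=(5, d))) @ w)   # approximate kernel ridge predictions
```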
Divide and conquer kernel ridge regression: A distributed algorithm with minimax optimal rates
We study a decomposition-based scalable approach to kernel ridge regression, and show
that it achieves minimax optimal convergence rates under relatively mild conditions. The …
Early stopping and non-parametric regression: an optimal data-dependent stopping rule
G Raskutti, MJ Wainwright, B Yu - The Journal of Machine Learning …, 2014 - jmlr.org
Early stopping is a form of regularization based on choosing when to stop running an
iterative algorithm. Focusing on non-parametric regression in a reproducing kernel Hilbert …
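The paper's rule is data-dependent and computed from the training data alone (via the empirical kernel eigenvalues), which is more than a snippet can reproduce; the sketch below only illustrates the underlying idea, early-stopped gradient descent on kernel least squares, with a plain holdout rule. The kernel, sizes, and patience threshold are all assumptions.

```python
import numpy as np

def rbf(A, B, sigma=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + 0.3 * rng.normal(size=200)
X_tr, X_va, y_tr, y_va = X[:150], X[150:], y[:150], y[150:]

K, K_va = rbf(X_tr, X_tr), rbf(X_va, X_tr)
alpha = np.zeros(len(y_tr))
eta = 1.0 / np.linalg.eigvalsh(K).max()        # safe step size for the quadratic

best_err, best_alpha, best_t = np.inf, alpha.copy(), 0
for t in range(1, 2001):
    # gradient step on q(a) = 0.5 a^T K a - y^T a, i.e. an early-stopped
    # iteration toward K a = y; stopping early acts as regularization
    alpha -= eta * (K @ alpha - y_tr)
    err = np.mean((K_va @ alpha - y_va) ** 2)
    if err < best_err:
        best_err, best_alpha, best_t = err, alpha.copy(), t
    elif t - best_t > 50:                      # stop once validation error stalls
        break
print(f"stopped at t={t}; best validation MSE {best_err:.4f} at t={best_t}")
```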
Distributed learning with regularized least squares
We study distributed learning with the least squares regularization scheme in a reproducing
kernel Hilbert space (RKHS). By a divide-and-conquer approach, the algorithm partitions a …
Divide and conquer kernel ridge regression
We study a decomposition-based scalable approach to performing kernel ridge regression.
The method is simply described: it randomly partitions a dataset of size N into m subsets of …
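A minimal sketch of this divide-and-conquer scheme, assuming a Gaussian kernel, a standard KRR solve on each subset, and a simple average of the local predictors; m, lam, and the helper names are placeholders.

```python
import numpy as np

def rbf(A, B, sigma=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def dc_krr_fit(X, y, m=4, lam=1e-3, seed=None):
    """Randomly split (X, y) into m subsets and solve KRR on each subset."""
    rng = np.random.default_rng(seed)
    models = []
    for part in np.array_split(rng.permutation(len(X)), m):
        Xp, yp = X[part], y[part]
        K = rbf(Xp, Xp)
        # each local solve is O((N/m)^3) instead of O(N^3) for the full problem
        alpha = np.linalg.solve(K + lam * len(part) * np.eye(len(part)), yp)
        models.append((Xp, alpha))
    return models

def dc_krr_predict(models, X_new):
    # the final estimator is the plain average of the m local KRR estimators
    return np.mean([rbf(X_new, Xp) @ a for Xp, a in models], axis=0)

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(1200, 1))
y = np.sin(X[:, 0]) + 0.3 * rng.normal(size=1200)
models = dc_krr_fit(X, y, m=4, seed=1)
print(dc_krr_predict(models, np.array([[0.0], [1.5]])))
```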
Sampling from Gaussian process posteriors using stochastic gradient descent
Gaussian processes are a powerful framework for quantifying uncertainty and for sequential
decision-making but are limited by the requirement of solving linear systems. In general, this …
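The paper goes further and draws full posterior samples; purely as an illustration of the iterative-solver idea, the sketch below approximates only the posterior mean by stochastic block-coordinate gradient steps on the quadratic whose minimizer is (K + noise * I)^{-1} y. The batch size, step-size rule, and iteration count are arbitrary choices, not the paper's algorithm.

```python
import numpy as np

def rbf(A, B, sigma=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(500, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=500)
A = rbf(X, X) + 0.1 ** 2 * np.eye(500)        # K + noise * I

# The posterior mean solves A alpha = y. Instead of a direct O(n^3) solve,
# minimize f(alpha) = 0.5 alpha^T A alpha - alpha^T y with stochastic steps.
v = rng.normal(size=500)                      # power iteration for a safe step size
for _ in range(50):
    v = A @ v
    v /= np.linalg.norm(v)
eta = 1.0 / (v @ A @ v)                       # ~ 1 / lambda_max(A)

alpha = np.zeros(500)
for _ in range(4000):
    B = rng.choice(500, size=50, replace=False)   # random block of coordinates
    alpha[B] -= eta * (A[B] @ alpha - y[B])       # partial gradient step

X_test = np.array([[0.0], [1.5]])
print(rbf(X_test, X) @ alpha)                 # approximate posterior mean
```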
Early stopping by correlating online indicators in neural networks
To minimize the generalization error in neural networks, a novel technique for identifying
overfitting during training is formally introduced. This enables support …
Nonparametric stochastic approximation with large step-sizes
A Dieuleveut, F Bach - 2016 - projecteuclid.org
We consider the random-design least-squares regression problem within the reproducing
kernel Hilbert space (RKHS) framework. Given a stream of independent and identically …
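A sketch of such a single-pass online scheme for a Gaussian kernel, with a constant step size gamma and Polyak averaging of the iterates. The step size and bandwidth are arbitrary, and this is the generic averaged stochastic gradient recursion in an RKHS, not necessarily the paper's exact algorithm.

```python
import numpy as np

def k_vec(A, x, sigma=1.0):
    return np.exp(-((A - x) ** 2).sum(-1) / (2 * sigma ** 2))

rng = np.random.default_rng(0)
n = 2000
X = rng.uniform(-3, 3, size=(n, 1))
y = np.sin(X[:, 0]) + 0.2 * rng.normal(size=n)

gamma = 0.5                  # constant "large" step size; averaging tames its variance
alpha = np.zeros(n)          # f_t = sum_{i <= t} alpha[i] k(x_i, .)
alpha_bar = np.zeros(n)      # running sum giving the average (1/n) sum_t f_t

for t in range(n):
    # f_t = f_{t-1} - gamma * (f_{t-1}(x_t) - y_t) * k(x_t, .)
    pred = k_vec(X[:t], X[t]) @ alpha[:t] if t > 0 else 0.0
    alpha[t] = -gamma * (pred - y[t])
    alpha_bar[: t + 1] += alpha[: t + 1]

alpha_bar /= n
print(k_vec(X, np.array([0.5])) @ alpha_bar)   # averaged-iterate prediction at x = 0.5
```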
Learning theory of distributed spectral algorithms
Spectral algorithms have been widely used and studied in learning theory and inverse
problems. This paper is concerned with distributed spectral algorithms, for handling big data …
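The snippet does not specify which filter function is analyzed; as one concrete instance, here is a hypothetical distributed estimator using the spectral cutoff filter g(s) = 1/s for s >= lam and 0 otherwise, applied to each machine's local kernel matrix and then averaged. The threshold, kernel, and partitioning are assumptions.

```python
import numpy as np

def rbf(A, B, sigma=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def spectral_cutoff_fit(Xp, yp, lam=1e-2):
    """Local spectral estimator: apply g(s) = 1/s for s >= lam (else 0)
    to the eigenvalues of the normalized kernel matrix K / n."""
    n = len(Xp)
    s, U = np.linalg.eigh(rbf(Xp, Xp) / n)
    g = np.where(s >= lam, 1.0 / np.maximum(s, lam), 0.0)   # truncated inverse
    alpha = U @ (g * (U.T @ yp)) / n
    return Xp, alpha

# Distributed variant: run the same spectral algorithm on each partition, then average.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(1200, 1))
y = np.sin(X[:, 0]) + 0.3 * rng.normal(size=1200)
parts = np.array_split(rng.permutation(1200), 4)
models = [spectral_cutoff_fit(X[p], y[p]) for p in parts]

X_test = np.array([[0.0], [1.5]])
print(np.mean([rbf(X_test, Xp) @ a for Xp, a in models], axis=0))
```

With the ridge filter g(s) = 1/(s + lam) this reduces to the distributed KRR of the earlier entries, which is one way to see why these methods are analyzed under a common spectral framework.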