Sketching as a tool for numerical linear algebra

DP Woodruff - … and Trends® in Theoretical Computer Science, 2014 - nowpublishers.com
This survey highlights the recent advances in algorithms for numerical linear algebra that
have come from the technique of linear sketching, whereby given a matrix, one first …
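
As a concrete illustration of the sketch-and-solve paradigm the survey covers, here is a minimal numpy sketch of least-squares regression via a CountSketch-style sketching matrix; the problem sizes and sketch dimension below are arbitrary illustrative choices, not the survey's bounds.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tall least-squares problem: min_x ||A x - b||_2 with n >> d.
n, d = 100_000, 50
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + 0.01 * rng.standard_normal(n)

# CountSketch: hash each of the n rows into one of m buckets with a random sign,
# so S*A and S*b are formed in a single pass without materializing S.
m = 2_000
bucket = rng.integers(0, m, size=n)
sign = rng.choice([-1.0, 1.0], size=n)
SA = np.zeros((m, d))
Sb = np.zeros(m)
np.add.at(SA, bucket, sign[:, None] * A)
np.add.at(Sb, bucket, sign * b)

# Solve the small sketched problem and compare with the exact solution.
x_sketch, *_ = np.linalg.lstsq(SA, Sb, rcond=None)
x_exact, *_ = np.linalg.lstsq(A, b, rcond=None)
print("relative error of sketched solution:",
      np.linalg.norm(x_sketch - x_exact) / np.linalg.norm(x_exact))
```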

Turning Big Data Into Tiny Data: Constant-Size Coresets for k-Means, PCA, and Projective Clustering

D Feldman, M Schmidt, C Sohler - SIAM Journal on Computing, 2020 - SIAM
We develop and analyze a method to reduce the size of a very large set of data points in a
high-dimensional Euclidean space R^d to a small set of weighted points such that the result …
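
The snippet below is only a toy illustration of the importance-sampling idea behind such coresets, not the paper's constant-size construction: points are sampled with probability proportional to a crude cost-based proxy for their sensitivity (a uniform subsample stands in for a bicriteria solution) and are reweighted so that the weighted cost is unbiased.

```python
import numpy as np

rng = np.random.default_rng(1)

def kmeans_cost(X, centers, w=None):
    # Squared distance of each point to its nearest center, optionally weighted.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1).min(axis=1)
    return d2 if w is None else w * d2

# Toy data: n points in R^d with some cluster structure.
n, d, k = 20_000, 10, 5
X = rng.standard_normal((n, d)) + rng.integers(0, k, size=(n, 1)) * 3.0

# Rough centers from a uniform subsample (stand-in for a bicriteria solution).
rough = X[rng.choice(n, size=k, replace=False)]

# Importance-sample points with probability proportional to their cost
# contribution (a simplified proxy for sensitivities), mixed with uniform
# sampling for stability, and reweight by 1/(m*p) for unbiasedness.
contrib = kmeans_cost(X, rough)
p = 0.5 * contrib / contrib.sum() + 0.5 / n
m = 500
idx = rng.choice(n, size=m, p=p)
coreset, weights = X[idx], 1.0 / (m * p[idx])

print("full cost   :", kmeans_cost(X, rough).sum())
print("coreset cost:", kmeans_cost(coreset, rough, weights).sum())
```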

Dimensionality reduction for k-means clustering and low rank approximation

MB Cohen, S Elder, C Musco, C Musco… - Proceedings of the forty …, 2015 - dl.acm.org
We show how to approximate a data matrix A with a much smaller sketch Ã that can be
used to solve a general class of constrained k-rank approximation problems to within (1+ε) …
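
A minimal sketch of the feature-space reduction the abstract describes, assuming scikit-learn for the k-means solver: cluster the data after multiplying by a Gaussian sketch with O(k) columns, then compare the resulting cost, measured in the original space, with clustering the full data. The sketch dimension 4k is an ad hoc choice, not the paper's bound.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(2)

def cost(X, labels):
    # k-means cost of an assignment, evaluated in the original space.
    return sum(((X[labels == c] - X[labels == c].mean(0)) ** 2).sum()
               for c in np.unique(labels))

n, d, k = 5_000, 200, 10
A = rng.standard_normal((n, d)) + rng.integers(0, k, size=(n, 1)) * 2.0

# Sketch the feature space: Ã = A S with a Gaussian map to O(k) dimensions
# (the paper also covers sparser, faster sketches and sharper dimension bounds).
t = 4 * k
S = rng.standard_normal((d, t)) / np.sqrt(t)
A_sketch = A @ S

labels_full   = KMeans(n_clusters=k, n_init=5, random_state=0).fit_predict(A)
labels_sketch = KMeans(n_clusters=k, n_init=5, random_state=0).fit_predict(A_sketch)

print("cost using full data    :", cost(A, labels_full))
print("cost using sketched data:", cost(A, labels_sketch))
```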

Practical sketching algorithms for low-rank matrix approximation

JA Tropp, A Yurtsever, M Udell, V Cevher - SIAM Journal on Matrix Analysis …, 2017 - SIAM
This paper describes a suite of algorithms for constructing low-rank approximations of an
input matrix from a random linear image, or sketch, of the matrix. These methods can …
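
A simplified single-pass variant in the spirit of the paper's basic reconstruction A ≈ Q (Ψ Q)^+ W, with dense Gaussian test matrices and arbitrary sketch sizes chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic matrix with a rapidly decaying spectrum.
m, n, r = 2_000, 1_500, 10
A = (rng.standard_normal((m, r)) @ rng.standard_normal((r, n))
     + 1e-3 * rng.standard_normal((m, n)))

# Sketch sizes: k for the range sketch, l > k for the co-range sketch.
k, l = 20, 40
Omega = rng.standard_normal((n, k))
Psi = rng.standard_normal((l, m))

# One pass over A collects both sketches.
Y = A @ Omega          # captures the column space of A
W = Psi @ A            # captures the row space of A

# Reconstruct: A ≈ Q (Psi Q)^+ W with Q an orthonormal basis for range(Y).
Q, _ = np.linalg.qr(Y)
X, *_ = np.linalg.lstsq(Psi @ Q, W, rcond=None)
A_hat = Q @ X

print("relative Frobenius error:",
      np.linalg.norm(A - A_hat) / np.linalg.norm(A))
```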

A framework for Bayesian optimization in embedded subspaces

A Nayebi, A Munteanu… - … Conference on Machine …, 2019 - proceedings.mlr.press
We present a theoretically founded approach for high-dimensional Bayesian optimization
based on low-dimensional subspace embeddings. We prove that the error in the Gaussian …
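
A minimal sketch of the hashing-based (count-sketch style) subspace embedding this line of work builds on, without any Gaussian-process machinery: each high-dimensional coordinate copies, with a random sign, one coordinate of the low-dimensional search point. The objective and all sizes are toy placeholders; a real run would plug lift() into a Bayesian optimization loop.

```python
import numpy as np

rng = np.random.default_rng(4)

# High-dimensional domain [-1, 1]^D, searched through a d-dimensional embedding.
D, d = 1_000, 10

# Hashing-based embedding: coordinate i of the full space takes, with a random
# sign, the value of low-dimensional coordinate h(i).
h = rng.integers(0, d, size=D)
s = rng.choice([-1.0, 1.0], size=D)

def lift(y):
    """Map a point y in the low-dimensional search space to the full domain."""
    return np.clip(s * y[h], -1.0, 1.0)

def objective(x):
    # Toy expensive objective with a low effective dimension.
    return ((x[:5] - 0.3) ** 2).sum()

# A Bayesian optimization loop would propose low-dimensional candidates y
# (e.g. by maximizing an acquisition function over [-1, 1]^d) and evaluate
# the objective at lift(y); here we simply evaluate random candidates.
candidates = rng.uniform(-1, 1, size=(100, d))
values = [objective(lift(y)) for y in candidates]
print("best value found:", min(values))
```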

Performance of Johnson-Lindenstrauss transform for k-means and k-medians clustering

K Makarychev, Y Makarychev… - Proceedings of the 51st …, 2019 - dl.acm.org
Consider an instance of Euclidean k-means or k-medians clustering. We show that the cost
of the optimal solution is preserved up to a factor of (1+ε) under a projection onto a random …
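
A quick numerical check of this phenomenon for a single fixed partition: project with a Gaussian Johnson-Lindenstrauss map and compare the k-means cost before and after. The target dimension below is an illustrative constant rather than the paper's bound, and the paper's guarantee is stronger, holding for all partitions simultaneously.

```python
import numpy as np

rng = np.random.default_rng(5)

def partition_cost(X, labels):
    # k-means cost of a fixed partition: squared distances to cluster centroids.
    return sum(((X[labels == c] - X[labels == c].mean(0)) ** 2).sum()
               for c in np.unique(labels))

# Clustered data in high dimension and an arbitrary fixed partition of it.
n, d, k = 10_000, 500, 20
labels = rng.integers(0, k, size=n)
X = rng.standard_normal((n, d)) + 2.0 * rng.standard_normal((k, d))[labels]

# Gaussian JL projection; the paper's target dimension depends only on k and
# eps, not on d.
t = 200
G = rng.standard_normal((d, t)) / np.sqrt(t)
X_proj = X @ G

print("cost before projection:", partition_cost(X, labels))
print("cost after projection :", partition_cost(X_proj, labels))
```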

Oblivious sketching of high-degree polynomial kernels

TD Ahle, M Kapralov, JBT Knudsen, R Pagh… - Proceedings of the …, 2020 - SIAM
Kernel methods are fundamental tools in machine learning that allow detection of non-linear
dependencies between data without explicitly constructing feature vectors in high …
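
For context, a sketch of the classical CountSketch-plus-FFT TensorSketch for the degree-q polynomial kernel (x·y)^q, which is the kind of construction this paper improves on for high degrees; the sketch size, degree, and test vectors are illustrative, and the estimate is randomized rather than exact.

```python
import numpy as np

rng = np.random.default_rng(6)

d, m, q = 1_000, 16_384, 3     # input dim, sketch dim, polynomial degree

# One independent CountSketch (hash + sign) per tensor factor.
hashes = rng.integers(0, m, size=(q, d))
signs = rng.choice([-1.0, 1.0], size=(q, d))

def count_sketch(x, h, s):
    c = np.zeros(m)
    np.add.at(c, h, s * x)
    return c

def tensor_sketch(x):
    """Sketch of the q-fold tensor power of x, via products in the Fourier domain."""
    prod = np.ones(m, dtype=complex)
    for j in range(q):
        prod *= np.fft.fft(count_sketch(x, hashes[j], signs[j]))
    return np.fft.ifft(prod).real

# Two correlated unit vectors, so the kernel value is well away from zero.
x = rng.standard_normal(d); x /= np.linalg.norm(x)
z = rng.standard_normal(d); z /= np.linalg.norm(z)
y = 0.8 * x + 0.6 * z

exact = (x @ y) ** q                      # degree-q polynomial kernel (x·y)^q
approx = tensor_sketch(x) @ tensor_sketch(y)
print(f"exact {exact:.3f}  vs  sketched estimate {approx:.3f}")
```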

Tanimoto random features for scalable molecular machine learning

A Tripp, S Bacallado, S Singh… - Advances in Neural …, 2024 - proceedings.neurips.cc
The Tanimoto coefficient is commonly used to measure the similarity between molecules
represented as discrete fingerprints, either as a distance metric or a positive definite kernel …
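
The snippet below does not implement the paper's random features; it only illustrates the binary case they generalize, where the Tanimoto coefficient equals the Jaccard similarity and classical MinHash already gives an unbiased randomized estimate of it.

```python
import numpy as np

rng = np.random.default_rng(7)

d, L = 2_048, 256             # fingerprint length, number of hash repetitions

def tanimoto(a, b):
    """Exact Tanimoto (Jaccard) similarity of two binary fingerprints."""
    return np.logical_and(a, b).sum() / np.logical_or(a, b).sum()

# L independent random permutations of the bit positions for MinHash.
perms = np.array([rng.permutation(d) for _ in range(L)])

def minhash(a):
    """Smallest permuted rank among the set bits, under each permutation."""
    on = np.flatnonzero(a)
    return perms[:, on].min(axis=1)

# Two correlated random binary fingerprints.
a = rng.random(d) < 0.1
b = np.where(rng.random(d) < 0.3, rng.random(d) < 0.1, a)

sig_a, sig_b = minhash(a), minhash(b)
print("exact Tanimoto  :", tanimoto(a, b))
print("MinHash estimate:", (sig_a == sig_b).mean())
```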

Randomized sketches for kernels: Fast and optimal nonparametric regression

Y Yang, M Pilanci, MJ Wainwright - The Annals of Statistics, 2017 - projecteuclid.org
Kernel ridge regression (KRR) is a standard method for performing nonparametric
regression over reproducing kernel Hilbert spaces. Given n samples, the time and space …
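
A compact numpy rendering of a sketched dual KRR program of the kind studied here, assuming a dense Gaussian sketch: the dual variable is restricted to the range of S^T, so an m x m system is solved instead of the n x n one. The sketch size m below is an ad hoc choice rather than the statistical dimension the theory prescribes.

```python
import numpy as np

rng = np.random.default_rng(8)

# Toy 1-D nonparametric regression problem.
n = 2_000
x = np.sort(rng.uniform(-3, 3, n))
y = np.sin(2 * x) + 0.3 * rng.standard_normal(n)

# Gaussian (RBF) kernel matrix and regularization parameter.
K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2)
lam = 1e-3

# Exact KRR: (K + n*lam*I) alpha = y.
alpha = np.linalg.solve(K + n * lam * np.eye(n), y)
f_exact = K @ alpha

# Sketched KRR: with KS = K S^T, solve the m x m normal equations
# (S K K S^T + n*lam * S K S^T) theta = S K y and predict with KS @ theta.
m = 100
S = rng.standard_normal((m, n)) / np.sqrt(m)
KS = K @ S.T                                      # n x m
theta = np.linalg.solve(KS.T @ KS + n * lam * (S @ KS), KS.T @ y)
f_sketch = KS @ theta

print("relative difference between exact and sketched fits:",
      np.linalg.norm(f_exact - f_sketch) / np.linalg.norm(f_exact))
```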

Faster kernel ridge regression using sketching and preconditioning

H Avron, KL Clarkson, DP Woodruff - SIAM Journal on Matrix Analysis and …, 2017 - SIAM
Kernel ridge regression is a simple yet powerful technique for nonparametric regression
whose computation amounts to solving a linear system. This system is usually dense and …
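
A sketch in the spirit of this approach: solve (K + lam*I) alpha = y with preconditioned conjugate gradients, where the preconditioner is the inverse of Z Z^T + lam*I for a random Fourier feature matrix Z, applied cheaply via the Woodbury identity. All sizes and the kernel bandwidth are illustrative, and scipy is assumed for the CG solver.

```python
import numpy as np
from scipy.sparse.linalg import cg, LinearOperator

rng = np.random.default_rng(9)

# Toy 1-D KRR system (K + lam*I) alpha = y with a Gaussian kernel.
n = 3_000
x = rng.uniform(-3, 3, n)
y = np.sin(2 * x) + 0.3 * rng.standard_normal(n)
K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2)
lam = 1e-2

# Random Fourier features Z (n x s) so that Z Z^T approximates K.
s = 200
w = rng.standard_normal(s)            # frequencies for the Gaussian kernel
b = rng.uniform(0, 2 * np.pi, s)      # random phases
Z = np.sqrt(2.0 / s) * np.cos(x[:, None] * w[None, :] + b[None, :])

# Preconditioner M ~ (Z Z^T + lam*I)^{-1}, applied via the Woodbury identity.
small = np.linalg.inv(lam * np.eye(s) + Z.T @ Z)
def apply_precond(v):
    return (v - Z @ (small @ (Z.T @ v))) / lam

M = LinearOperator((n, n), matvec=apply_precond)
A = LinearOperator((n, n), matvec=lambda v: K @ v + lam * v)

alpha, info = cg(A, y, M=M)
print("CG converged:", info == 0,
      " residual norm:", np.linalg.norm(K @ alpha + lam * alpha - y))
```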