Data-dependent coresets for compressing neural networks with applications to generalization bounds

C Baykal, L Liebenwein, I Gilitschenski… - arXiv preprint arXiv …, 2018 - arxiv.org
We present an efficient coresets-based neural network compression algorithm that sparsifies
the parameters of a trained fully-connected neural network in a manner that provably …

The unreasonable effectiveness of structured random orthogonal embeddings

KM Choromanski, M Rowland… - Advances in neural …, 2017 - proceedings.neurips.cc
We examine a class of embeddings based on structured random matrices with orthogonal
rows which can be applied in many machine learning applications including dimensionality …

On the expressive power of self-attention matrices

V Likhosherstov, K Choromanski, A Weller - arXiv preprint arXiv …, 2021 - arxiv.org
Transformer networks are able to capture patterns in data coming from many domains (text,
images, videos, proteins, etc.) with little or no change to architecture components. We …

Sensitivity-informed provable pruning of neural networks

C Baykal, L Liebenwein, I Gilitschenski… - SIAM Journal on …, 2022 - SIAM
We introduce a family of pruning algorithms that sparsifies the parameters of a trained model
in a way that approximately preserves the model's predictive accuracy. Our algorithms use a …

The geometry of random features

K Choromanski, M Rowland, T Sarlós… - International …, 2018 - proceedings.mlr.press
We present an in-depth examination of the effectiveness of radial basis function kernel
(beyond Gaussian) estimators based on orthogonal random feature maps. We show that …

Recycling randomness with structure for sublinear time kernel expansions

K Choromanski, V Sindhwani - International Conference on …, 2016 - proceedings.mlr.press
We propose a scheme for recycling Gaussian random vectors into structured matrices to
approximate various kernel functions in sublinear time via random embeddings. Our frame …

Structured adaptive and random spinners for fast machine learning computations

M Bojarski, A Choromanska… - Artificial intelligence …, 2017 - proceedings.mlr.press
We consider an efficient computational framework for speeding up several machine learning
algorithms with almost no loss of accuracy. The proposed framework relies on projections …

FROSH: FasteR Online Sketching Hashing.

X Chen, I King, MR Lyu - UAI, 2017 - auai.org
Many hashing methods, especially those that are in the data-dependent category with good
learning accuracy, are still inefficient when dealing with three critical problems in modern …

Binary vectors for fast distance and similarity estimation

DA Rachkovskij - Cybernetics and Systems Analysis, 2017 - Springer
This review considers methods and algorithms for fast estimation of distance/similarity
measures between initial data from vector representations with binary or integer-valued …

On binary embedding using circulant matrices

FX Yu, A Bhaskara, S Kumar, Y Gong… - Journal of Machine …, 2018 - jmlr.org
Binary embeddings provide efficient and powerful ways to perform operations on large scale
data. However, binary embedding typically requires long codes in order to preserve the …