The specious art of single-cell genomics
Dimensionality reduction is standard practice for filtering noise and identifying relevant
features in large-scale data analyses. In biology, single-cell genomics studies typically begin …
Optimality of the Johnson-Lindenstrauss lemma
For any d, n ≥ 2 and 1/(min{n, d})^0.4999 < ε < 1, we show the existence of a set of n vectors
X ⊂ ℝ^d such that any embedding f: X → ℝ^m satisfying ∀ x, y ∈ X, (1 - ε)∥x - y∥_2^2 ≤ ∥f(x) - f …
Random-projection ensemble classification
TI Cannings, RJ Samworth - Journal of the Royal Statistical …, 2017 - academic.oup.com
We introduce a very general method for high dimensional classification, based on careful
combination of the results of applying an arbitrary base classifier to random projections of …
8: low-distortion embeddings of finite metric spaces
P Indyk, J Matoušek, A Sidiropoulos - Handbook of discrete and …, 2017 - taylorfrancis.com
An n-point metric space (X, D) can be represented by an n × n …
Random projections: Data perturbation for classification problems
TI Cannings - Wiley Interdisciplinary Reviews: Computational …, 2021 - Wiley Online Library
Random projections offer an appealing and flexible approach to a wide range of large‐scale
statistical problems. They are particularly useful in high‐dimensional settings, where we …
Oblivious dimension reduction for k-means: beyond subspaces and the Johnson-Lindenstrauss lemma
We show that for n points in d-dimensional Euclidean space, a data oblivious random
projection of the columns onto m ∈ O((log k + log log n) ε^-6 log 1/ε) dimensions is sufficient to …
Coresets-methods and history: A theoreticians design pattern for approximation and streaming algorithms
A Munteanu, C Schwiegelshohn - KI-Künstliche Intelligenz, 2018 - Springer
We present a technical survey on the state of the art approaches in data reduction and the
coreset framework. These include geometric decompositions, gradient methods, random …
TopP&R: Robust support estimation approach for evaluating fidelity and diversity in generative models
We propose a robust and reliable evaluation metric for generative models called
Topological Precision and Recall (TopP&R, pronounced “topper”), which systematically …
Toward a unified theory of sparse dimensionality reduction in Euclidean space
Let Φ ∈ ℝ^(m×n) be a sparse Johnson-Lindenstrauss transform [52] with column sparsity s.
For a subset T of the unit sphere and ε ∈ (0, 1/2), we study settings for m, s to ensure E_Φ …