Optimality of the Johnson-Lindenstrauss lemma

Y Luan, J Eisenstein, K Toutanova… - Transactions of the …, 2021 - direct.mit.edu

Dual encoders perform retrieval by encoding documents and queries into dense low-
dimensional vectors, scoring each document by its inner product with the query. We …

Cited by 411 Related articles All 8 versions

The specious art of single-cell genomics

T Chari, L Pachter - PLOS Computational Biology, 2023 - journals.plos.org

Dimensionality reduction is standard practice for filtering noise and identifying relevant
features in large-scale data analyses. In biology, single-cell genomics studies typically begin …

Cited by 266 Related articles All 15 versions

A Nearly-Optimal Bound for Fast Regression with $\ell_\infty$ Guarantee

Z Song, M Ye, J Yin, L Zhang - International Conference on …, 2023 - proceedings.mlr.press

Given a matrix $A\in\mathbb{R}^{n\times d}$ and a vector $b\in\mathbb{R}^n$, we
consider the regression problem with $\ell_\infty$ guarantees: finding a vector …

Cited by 9 Related articles All 2 versions

A new coreset framework for clustering

V Cohen-Addad, D Saulpic… - Proceedings of the 53rd …, 2021 - dl.acm.org

Given a metric space, the (k, z)-clustering problem consists of finding k centers such that the
sum of the distances raised to the power z of every point to its closest center is minimized …

Cited by 76 Related articles All 11 versions

Performance of Johnson-Lindenstrauss transform for k-means and k-medians clustering

K Makarychev, Y Makarychev… - Proceedings of the 51st …, 2019 - dl.acm.org

Consider an instance of Euclidean k-means or k-medians clustering. We show that the cost
of the optimal solution is preserved up to a factor of (1+ ε) under a projection onto a random …

Cited by 146 Related articles All 8 versions

Towards optimal lower bounds for k-median and k-means coresets

V Cohen-Addad, KG Larsen, D Saulpic… - Proceedings of the 54th …, 2022 - dl.acm.org

The (k, z)-clustering problem consists of finding a set of k points called centers, such that the
sum of distances raised to the power of z of every data point to its closest center is …

Cited by 54 Related articles All 11 versions

Neural ODE control for classification, approximation, and transport

D Ruiz-Balet, E Zuazua - SIAM Review, 2023 - SIAM

We analyze neural ordinary differential equations (NODEs) from a control theoretical
perspective to address some of the main properties and paradigms of deep learning (DL), in …

Cited by 82 Related articles All 13 versions

t-SNE-CUDA: GPU-Accelerated t-SNE and its Applications to Modern Data

DM Chan, R Rao, F Huang… - 2018 30th International …, 2018 - ieeexplore.ieee.org

Modern datasets and models are notoriously difficult to explore and analyze due to their
inherent high dimensionality and massive numbers of samples. Existing visualization …

Cited by 120 Related articles All 8 versions

Training (overparametrized) neural networks in near-linear time

J Brand, B Peng, Z Song, O Weinstein - arXiv preprint arXiv:2006.11648, 2020 - arxiv.org

The slow convergence rate and pathological curvature issues of first-order gradient methods
for training deep neural networks initiated an ongoing effort for developing faster $\mathit …

Cited by 85 Related articles All 11 versions

GPU accelerated t-distributed stochastic neighbor embedding

DM Chan, R Rao, F Huang, JF Canny - Journal of Parallel and Distributed …, 2019 - Elsevier

Modern datasets and models are notoriously difficult to explore and analyze due to their
inherent high dimensionality and massive numbers of samples. Existing visualization …

Cited by 89 Related articles All 4 versions