Discovery of rare cells from voluminous single cell expression data

A Jindal, P Gupta, Jayadeva, D Sengupta - Nature communications, 2018 - nature.com
Single cell messenger RNA sequencing (scRNA-seq) provides a window into transcriptional
landscapes in complex tissues. The recent introduction of droplet based transcriptomics …

Modeling LSH for performance tuning

W Dong, Z Wang, W Josephson, M Charikar… - Proceedings of the 17th …, 2008 - dl.acm.org
Although Locality-Sensitive Hashing (LSH) is a promising approach to similarity search in
high-dimensional spaces, it has not been considered practical partly because its search …

Ensemble dimensionality reduction and feature gene extraction for single-cell RNA-seq data

X Sun, Y Liu, L An - Nature Communications, 2020 - nature.com
Single-cell RNA sequencing (scRNA-seq) technologies allow researchers to uncover the
biological states of a single cell at high resolution. For computational efficiency and easy …

Asymmetric distance estimation with sketches for similarity search in high-dimensional spaces

W Dong, M Charikar, K Li - Proceedings of the 31st annual international …, 2008 - dl.acm.org
Efficient similarity search in high-dimensional spaces is important to content-based retrieval
systems. Recent studies have shown that sketches can effectively approximate L1 distance …

Fast locally weighted PLS modeling for large-scale industrial processes

X Zhang, C Wei, Z Song - Industrial & Engineering Chemistry …, 2020 - ACS Publications
Locally weighted partial least-squares (LW-PLS) is an efficient just-in-time (JIT) modeling
method, which can handle process collinearity, nonlinearity, and time-varying …

iDEC: indexable distance estimating codes for approximate nearest neighbor search

L Gong, H Wang, M Ogihara, J Xu - Proceedings of the VLDB …, 2020 - par.nsf.gov
ABSTRACT Approximate Nearest Neighbor (ANN) search is a fundamental algorithmic
problem, with numerous applications in many areas of computer science. In this work, we …

Large scale hamming distance query processing

AX Liu, K Shen, E Torng - 2011 IEEE 27th International …, 2011 - ieeexplore.ieee.org
Hamming distance has been widely used in many application domains, such as near-
duplicate detection and pattern recognition. We study Hamming distance range query …

Binary vectors for fast distance and similarity estimation

DA Rachkovskij - Cybernetics and Systems Analysis, 2017 - Springer
This review considers methods and algorithms for fast estimation of distance/similarity
measures between initial data from vector representations with binary or integer-valued …

Binary sketches for secondary filtering

V Mic, D Novak, P Zezula - ACM Transactions on Information Systems …, 2018 - dl.acm.org
This article addresses the problem of matching the most similar data objects to a given query
object. We adopt a generic model of similarity that involves the domain of objects and metric …

Fast Approximate Nearest Neighbor Search with a Dynamic Exploration Graph using Continuous Refinement

N Hezel, KU Barthel, K Schall, K Jung - arXiv preprint arXiv:2307.10479, 2023 - arxiv.org
For approximate nearest neighbor search, graph-based algorithms have shown to offer the
best trade-off between accuracy and search time. We propose the Dynamic Exploration …