A survey on locality sensitive hashing algorithms and their applications
Finding nearest neighbors in high-dimensional spaces is a fundamental operation in many
diverse application domains. Locality Sensitive Hashing (LSH) is one of the most popular …
Approximate Nearest Neighbor Search in High Dimensional Vector Databases: Current Research and Future Directions
Approximate nearest neighbor search is an important research topic with a wide range of
applications. In this study, we first introduce the problem and review major research results …
DB-LSH 2.0: Locality-sensitive hashing with query-based dynamic bucketing
Locality-sensitive hashing (LSH) is a promising family of methods for the high-dimensional
approximate nearest neighbor (ANN) search problem due to its sub-linear query time and …
LOTUS: Enabling Semantic Queries with LLMs Over Tables of Unstructured and Structured Data
The semantic capabilities of language models (LMs) have the potential to enable rich
analytics and reasoning over vast knowledge corpora. Unfortunately, existing systems lack …
Efficient approximate nearest neighbor search in multi-dimensional databases
Approximate nearest neighbor (ANN) search is a fundamental search in multi-dimensional
databases, which has numerous real-world applications, such as image retrieval …
Dumpy: A compact and adaptive index for large data series collections
Data series indexes are necessary for managing and analyzing the increasing amounts of
data series collections that are nowadays available. These indexes support both exact and …
FARGO: Fast maximum inner product search via global multi-probing
Maximum inner product search (MIPS) in high-dimensional spaces has wide applications
but is computationally expensive due to the curse of dimensionality. Existing studies employ …
ACORN: Performant and Predicate-Agnostic Search Over Vector Embeddings and Structured Data
Applications increasingly leverage mixed-modality data, and must jointly search over vector
data, such as embedded images, text and video, as well as structured data, such as …
A New Sparse Data Clustering Method Based On Frequent Items
Large, sparse categorical data is a natural way to represent complex data like sequences,
trees, and graphs. Such data is prevalent in many applications, e.g., Criteo released a …
HJG: An Effective Hierarchical Joint Graph for ANNS in Multi-Metric Spaces
Owing to the widespread deployment of smartphones and networked devices, massive
amount of data in different types are generated every day, including numeric data, locations …