EI-LSH: An early-termination driven I/O efficient incremental c-approximate nearest neighbor search

O Jafari, P Maurya, P Nagarkar, KM Islam… - arXiv preprint arXiv …, 2021 - arxiv.org

Finding nearest neighbors in high-dimensional spaces is a fundamental operation in many
diverse application domains. Locality Sensitive Hashing (LSH) is one of the most popular …

Cited by 87 · Related articles · All 3 versions

Approximate Nearest Neighbor Search in High Dimensional Vector Databases: Current Research and Future Directions

Y Tian, Z Yue, R Zhang, X Zhao, B Zheng… - IEEE Data Eng …, 2023 - sites.computer.org

Approximate nearest neighbor search is an important research topic with a wide range of
applications. In this study, we first introduce the problem and review major research results …

Cited by 2 · Related articles · All 2 versions

DB-LSH 2.0: Locality-sensitive hashing with query-based dynamic bucketing

Y Tian, X Zhao, X Zhou - IEEE Transactions on Knowledge and …, 2023 - ieeexplore.ieee.org

Locality-sensitive hashing (LSH) is a promising family of methods for the high-dimensional
approximate nearest neighbor (ANN) search problem due to its sub-linear query time and …

Cited by 21 · Related articles · All 10 versions

LOTUS: Enabling Semantic Queries with LLMs Over Tables of Unstructured and Structured Data

L Patel, S Jha, C Guestrin, M Zaharia - arXiv preprint arXiv:2407.11418, 2024 - arxiv.org

The semantic capabilities of language models (LMs) have the potential to enable rich
analytics and reasoning over vast knowledge corpora. Unfortunately, existing systems lack …

Cited by 3 · Related articles · All 2 versions

Efficient approximate nearest neighbor search in multi-dimensional databases

Y Peng, B Choi, TN Chan, J Yang, J Xu - … of the ACM on Management of …, 2023 - dl.acm.org

Approximate nearest neighbor (ANN) search is a fundamental search in multi-dimensional
databases, which has numerous real-world applications, such as image retrieval …

Cited by 14 · Related articles · All 4 versions

Dumpy: A compact and adaptive index for large data series collections

Z Wang, Q Wang, P Wang, T Palpanas… - Proceedings of the ACM …, 2023 - dl.acm.org

Data series indexes are necessary for managing and analyzing the increasing amounts of
data series collections that are nowadays available. These indexes support both exact and …

Cited by 14 · Related articles · All 7 versions

FARGO: Fast maximum inner product search via global multi-probing

X Zhao, B Zheng, X Yi, X Luan, C Xie, X Zhou… - Proceedings of the …, 2023 - dl.acm.org

Maximum inner product search (MIPS) in high-dimensional spaces has wide applications
but is computationally expensive due to the curse of dimensionality. Existing studies employ …

Cited by 6 · Related articles · All 5 versions

ACORN: Performant and Predicate-Agnostic Search Over Vector Embeddings and Structured Data

L Patel, P Kraft, C Guestrin, M Zaharia - … of the ACM on Management of …, 2024 - dl.acm.org

Applications increasingly leverage mixed-modality data, and must jointly search over vector
data, such as embedded images, text and video, as well as structured data, such as …

Cited by 2 · Related articles · All 3 versions

A New Sparse Data Clustering Method Based On Frequent Items

Q Huang, P Luo, AKH Tung - Proceedings of the ACM on Management …, 2023 - dl.acm.org

Large, sparse categorical data is a natural way to represent complex data like sequences,
trees, and graphs. Such data is prevalent in many applications, e.g., Criteo released a …

Cited by 1 · Related articles · All 3 versions

HJG: An Effective Hierarchical Joint Graph for ANNS in Multi-Metric Spaces

Y Zhu, L Chen, Y Gao, R Ma, B Zheng… - 2024 IEEE 40th …, 2024 - ieeexplore.ieee.org

Owing to the widespread deployment of smartphones and networked devices, massive
amount of data in different types are generated every day, including numeric data, locations …