Billion-scale similarity search with GPUs

J Johnson, M Douze, H Jégou - IEEE Transactions on Big Data, 2019 - ieeexplore.ieee.org
Similarity search finds application in database systems handling complex data such as
images or videos, which are typically represented by high-dimensional features and require …

Milvus: A purpose-built vector data management system

J Wang, X Yi, R Guo, H Jin, P Xu, S Li, X Wang… - Proceedings of the …, 2021 - dl.acm.org
Recently, there has been a pressing need to manage high-dimensional vector data in data
science and AI applications. This trend is fueled by the proliferation of unstructured data and …

A survey on learning to hash

J Wang, T Zhang, N Sebe… - IEEE Transactions on …, 2017 - ieeexplore.ieee.org
Nearest neighbor search is a problem of finding the data points from the database such that
the distances from them to the query point are the smallest. Learning to hash is one of the …

SPANN: Highly-efficient billion-scale approximate nearest neighborhood search

Q Chen, B Zhao, H Wang, M Li, C Liu… - Advances in …, 2021 - proceedings.neurips.cc
The in-memory algorithms for approximate nearest neighbor search (ANNS) have achieved
great success for fast high-recall search, but are extremely expensive when handling very …

Approximate nearest neighbor search on high dimensional data—experiments, analyses, and improvement

W Li, Y Zhang, Y Sun, W Wang, M Li… - … on Knowledge and …, 2019 - ieeexplore.ieee.org
Nearest neighbor search is a fundamental and essential operation in applications from
many domains, such as databases, machine learning, multimedia, and computer vision …

The inverted multi-index

A Babenko, V Lempitsky - IEEE Transactions on Pattern …, 2014 - ieeexplore.ieee.org
A new data structure for efficient similarity search in very large datasets of high-dimensional
vectors is introduced. This structure called the inverted multi-index generalizes the inverted …

K-means hashing: An affinity-preserving quantization method for learning binary compact codes

K He, F Wen, J Sun - … of the IEEE Conference on Computer …, 2013 - openaccess.thecvf.com
In computer vision there has been increasing interest in learning hashing codes whose
Hamming distance approximates the data similarity. The hashing functions play roles in both …

Efficient indexing of billion-scale datasets of deep descriptors

A Babenko, V Lempitsky - Proceedings of the IEEE …, 2016 - openaccess.thecvf.com
Existing billion-scale nearest neighbor search systems have mostly been compared on a
single dataset of a billion of SIFT vectors, where systems based on the Inverted Multi-Index …

Locally optimized product quantization for approximate nearest neighbor search

Y Kalantidis, Y Avrithis - Proceedings of the IEEE …, 2014 - openaccess.thecvf.com
We present a simple vector quantizer that combines low distortion with fast search and apply
it to approximate nearest neighbor (ANN) search in high dimensional spaces. Leveraging …

Composite quantization for approximate nearest neighbor search

T Zhang, C Du, J Wang - International Conference on …, 2014 - proceedings.mlr.press
This paper presents a novel compact coding approach, composite quantization, for
approximate nearest neighbor search. The idea is to use the composition of several …