High-dimensional similarity query processing for data science

J Qin, W Wang, C Xiao, Y Zhang, Y Wang - Proceedings of the 27th ACM …, 2021 - dl.acm.org
Similarity query (aka nearest neighbor query) processing has been an active research topic
for several decades. It is an essential procedure in a wide range of applications (e.g. …

Space and time efficient kernel density estimation in high dimensions

A Backurs, P Indyk, T Wagner - Advances in neural …, 2019 - proceedings.neurips.cc
Recently, Charikar and Siminelakis (2017) presented a framework for kernel
density estimation in provably sublinear query time, for kernels that possess a certain …

VBASE: Unifying Online Vector Similarity Search and Relational Queries via Relaxed Monotonicity

Q Zhang, S Xu, Q Chen, G Sui, J Xie, Z Cai… - … USENIX Symposium on …, 2023 - usenix.org
Approximate similarity queries on high-dimensional vector indices have become the
cornerstone for many critical online services. An increasing need for more sophisticated …

Rehashing kernel evaluation in high dimensions

P Siminelakis, K Rong, P Bailis… - International …, 2019 - proceedings.mlr.press
Kernel methods are effective but do not scale well to large scale data, especially in high
dimensions where the geometric data structures used to accelerate kernel evaluation suffer …

Kernel density estimation through density constrained near neighbor search

M Charikar, M Kapralov, N Nouri… - 2020 IEEE 61st …, 2020 - ieeexplore.ieee.org
In this paper we revisit the kernel density estimation problem: given a kernel K(x, y) and a
dataset of n points in high dimensional Euclidean space, prepare a data structure that can …

Monotonic cardinality estimation of similarity selection: A deep learning approach

Y Wang, C Xiao, J Qin, X Cao, Y Sun, W Wang… - Proceedings of the …, 2020 - dl.acm.org
In this paper, we investigate the possibilities of utilizing deep learning for cardinality
estimation of similarity selection. Answering this problem accurately and efficiently is …

Fast rotation kernel density estimation over data streams

R Lei, P Wang, R Li, P Jia, J Zhao, X Guan… - Proceedings of the 27th …, 2021 - dl.acm.org
Kernel density estimation method is a powerful tool and is widely used in many important
real-world applications such as anomaly detection and statistical learning. Unfortunately …

Similarity query processing for high-dimensional data

J Qin, W Wang, C Xiao, Y Zhang - Proceedings of the VLDB …, 2020 - opus.lib.uts.edu.au
Similarity query processing has been an active research topic for several decades. It is an
essential procedure in a wide range of applications. Recently, embedding and auto …

Consistent and flexible selectivity estimation for high-dimensional data

Y Wang, C Xiao, J Qin, R Mao, M Onizuka… - Proceedings of the …, 2021 - dl.acm.org
Selectivity estimation aims at estimating the number of database objects that satisfy a
selection criterion. Answering this problem accurately and efficiently is essential to many …

Cardinality estimation of activity trajectory similarity queries using deep learning

R Tian, W Zhang, F Wang, J Zhou, A Alhudhaif… - Information …, 2023 - Elsevier
Cardinality estimation, which involves estimating the result size of queries, is a critical aspect
of query processing and optimization. Deep Neural Networks (DNNs) are data hungry, and …