High-dimensional similarity query processing for data science

J Qin, W Wang, C Xiao, Y Zhang, Y Wang - Proceedings of the 27th ACM …, 2021 - dl.acm.org
Similarity query (a.k.a. nearest neighbor query) processing has been an active research topic
for several decades. It is an essential procedure in a wide range of applications (e.g. …

Deep non-crossing quantiles through the partial derivative

A Brando, BS Center… - International …, 2022 - proceedings.mlr.press
Quantile Regression (QR) provides a way to approximate a single conditional quantile. To
have a more informative description of the conditional distribution, QR can be merged with …

Cardinality estimation of approximate substring queries using deep learning

S Kwon, W Jung, K Shim - Proceedings of the VLDB Endowment, 2022 - dl.acm.org
Cardinality estimation of an approximate substring query is an important problem in
database systems. Traditional approaches build a summary from the text data and estimate …

Database optimizers in the era of learning

D Tsesmelis, A Simitsis - 2022 IEEE 38th International …, 2022 - ieeexplore.ieee.org
In this tutorial, we review advances made recently in a decades-old problem, namely query
optimization. Along with the traditional optimization techniques, many of which are still being …

Similarity query processing for high-dimensional data

J Qin, W Wang, C Xiao, Y Zhang - Proceedings of the VLDB …, 2020 - opus.lib.uts.edu.au
Similarity query processing has been an active research topic for several decades. It is an
essential procedure in a wide range of applications. Recently, embedding and auto …

Selectivity functions of range queries are learnable

X Hu, Y Liu, H Xiu, PK Agarwal, D Panigrahi… - Proceedings of the …, 2022 - dl.acm.org
This paper explores the use of machine learning for estimating the selectivity of range
queries in database systems. Using classic learning theory for real-valued functions based …

HAP: an efficient hamming space index based on augmented pigeonhole principle

Q Liu, Y Shen, L Chen - … of the 2022 International Conference on …, 2022 - dl.acm.org
The emerging deep learning techniques prefer mapping complex data objects (e.g., images,
documents) to compact binary vectors (i.e., hash codes) for efficient similarity search. In this …

Learned probing cardinality estimation for high-dimensional approximate NN search

B Zheng, Z Yue, Q Hu, X Yi, X Luan… - 2023 IEEE 39th …, 2023 - ieeexplore.ieee.org
Approximate nearest neighbor (ANN) search in high-dimensional space plays an essential
role in a variety of real-world applications. A well-known solution to ANN search, inverted file …

Learning-based query optimization for multi-probe approximate nearest neighbor search

P Zhang, B Yao, C Gao, B Wu, X He, F Li, Y Lu… - The VLDB Journal, 2023 - Springer
Approximate nearest neighbor search (ANNS) is a fundamental problem that has attracted
widespread attention for decades. Multi-probe ANNS is one of the most important classes of …

Consistent and flexible selectivity estimation for high-dimensional data

Y Wang, C Xiao, J Qin, R Mao, M Onizuka… - Proceedings of the …, 2021 - dl.acm.org
Selectivity estimation aims at estimating the number of database objects that satisfy a
selection criterion. Answering this problem accurately and efficiently is essential to many …