Searching in high-dimensional spaces: Index structures for improving the performance of multimedia databases

C Böhm, S Berchtold, DA Keim - ACM Computing Surveys (CSUR), 2001 - dl.acm.org
During the last decade, multimedia databases have become increasingly important in many
application areas such as medicine, CAD, geography, and molecular biology. An important …

PathSim: Meta path-based top-k similarity search in heterogeneous information networks

Y Sun, J Han, X Yan, PS Yu, T Wu - Proceedings of the VLDB …, 2011 - dl.acm.org
Similarity search is a primitive operation in database and Web search engines. With the
advent of large-scale heterogeneous information networks that consist of multi-typed …

Mining heterogeneous information networks: a structural analysis approach

Y Sun, J Han - ACM SIGKDD explorations newsletter, 2013 - dl.acm.org
Most objects and data in the real world are of multiple types, interconnected, forming
complex, heterogeneous but often semi-structured information networks. However, most …

On the surprising behavior of distance metrics in high dimensional space

CC Aggarwal, A Hinneburg, DA Keim - … London, UK, January 4–6, 2001 …, 2001 - Springer
In recent years, the effect of the curse of high dimensionality has been studied in great detail
on several problems such as clustering, nearest neighbor search, and indexing. In high …

Clustering high-dimensional data: A survey on subspace clustering, pattern-based clustering, and correlation clustering

HP Kriegel, P Kröger, A Zimek - … on knowledge discovery from data (tkdd …, 2009 - dl.acm.org
As a prolific research area in data mining, subspace clustering and related problems
induced a vast quantity of proposed solutions. However, many publications compare a new …

Ranking-based clustering of heterogeneous information networks with star network schema

Y Sun, Y Yu, J Han - Proceedings of the 15th ACM SIGKDD international …, 2009 - dl.acm.org
A heterogeneous information network is an information network composed of multiple types
of objects. Clustering on such a network may lead to better understanding of both hidden …

iDistance: An adaptive B+-tree based indexing method for nearest neighbor search

HV Jagadish, BC Ooi, KL Tan, C Yu… - ACM Transactions on …, 2005 - dl.acm.org
In this article, we present an efficient B+-tree based indexing method, called iDistance, for
K-nearest neighbor (KNN) search in a high-dimensional metric space. iDistance partitions the …

HeteSim: A general framework for relevance measure in heterogeneous networks

C Shi, X Kong, Y Huang, PS Yu… - IEEE Transactions on …, 2014 - ieeexplore.ieee.org
Similarity search is an important function in many applications, which usually focuses on
measuring the similarity between objects with the same type. However, in many scenarios …

A survey of query result diversification

K Zheng, H Wang, Z Qi, J Li, H Gao - Knowledge and Information Systems, 2017 - Springer
Nowadays, in information systems such as web search engines and databases, diversity is
becoming increasingly essential and getting more and more attention for improving users' …

Voronoi-based k nearest neighbor search for spatial network databases

M Kolahdouzan, C Shahabi - … of the Thirtieth international conference on …, 2004 - vldb.org
A frequent type of query in spatial networks (e.g., road networks) is to find the K nearest
neighbors (KNN) of a given query object. With these networks, the distances between …