Subspace clustering meets dense subgraph mining: A synthesis of two paradigms
S Günnemann, I Färber, B Boden… - 2010 IEEE international …, 2010 - ieeexplore.ieee.org
2010 IEEE International Conference on Data Mining, 2010 • ieeexplore.ieee.org
Today's applications deal with multiple types of information: graph data to represent the relations between objects and attribute data to characterize single objects. Analyzing both data sources simultaneously can increase the quality of mining methods. Recently, combined clustering approaches were introduced, which detect densely connected node sets within one large graph that also show high similarity according to all of their attribute values. However, for attribute data it is known that this full-space clustering often leads to poor clustering results. Thus, subspace clustering was introduced to identify locally relevant subsets of attributes for each cluster. In this work, we propose a method for finding homogeneous groups by joining the paradigms of subspace clustering and dense subgraph mining, i.e., we determine sets of nodes that show high similarity in subsets of their dimensions and that are also densely connected within the given graph. Our twofold clusters are optimized according to their density, size, and number of relevant dimensions. Our redundancy model confines the clustering to a manageable size containing only the most interesting clusters. We introduce the algorithm Gamer for the efficient calculation of our clustering. In thorough experiments on synthetic and real-world data we show that Gamer achieves low runtimes and high clustering quality.
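The idea of a "twofold" cluster can be illustrated with a minimal sketch. The abstract does not give the formal definitions, so everything below is an assumption made for illustration: a dimension counts as relevant when all attribute values of the node set fall within a hypothetical width threshold `max_width`, and connectivity is measured as plain edge density against a hypothetical `min_density`. The function names and the toy data are invented, and the sketch only checks one candidate node set; it is not the Gamer algorithm and includes no search or redundancy model.

```python
def relevant_dimensions(attrs, nodes, max_width):
    """Dimensions in which all given nodes lie within max_width of each other.

    Assumption: this 'width' criterion stands in for subspace similarity.
    """
    num_dims = len(next(iter(attrs.values())))
    dims = []
    for d in range(num_dims):
        values = [attrs[v][d] for v in nodes]
        if max(values) - min(values) <= max_width:
            dims.append(d)
    return dims


def edge_density(adj, nodes):
    """Fraction of possible edges among the nodes that exist in the graph."""
    nodes = list(nodes)
    n = len(nodes)
    if n < 2:
        return 0.0
    present = sum(
        1
        for i, u in enumerate(nodes)
        for v in nodes[i + 1:]
        if v in adj[u]
    )
    return present / (n * (n - 1) / 2)


def is_twofold_cluster(adj, attrs, nodes, max_width, min_density, min_dims):
    """True if the node set is dense in the graph AND similar in a subspace."""
    dims = relevant_dimensions(attrs, nodes, max_width)
    return edge_density(adj, nodes) >= min_density and len(dims) >= min_dims


# Toy data (hypothetical): undirected graph as adjacency sets, 3 attributes per node.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
attrs = {
    "a": (1.0, 5.0, 0.20),
    "b": (1.1, 9.0, 0.30),
    "c": (0.9, 2.0, 0.25),
    "d": (4.0, 2.1, 0.90),
}

# {a, b, c} is a triangle and agrees in dimensions 0 and 2, but not in dimension 1.
print(relevant_dimensions(attrs, {"a", "b", "c"}, max_width=0.5))
print(is_twofold_cluster(adj, attrs, {"a", "b", "c"}, 0.5, 0.6, 2))
# Adding d keeps the set fairly dense but leaves no shared relevant dimension.
print(is_twofold_cluster(adj, attrs, {"a", "b", "c", "d"}, 0.5, 0.6, 2))
```

In this toy example a full-space similarity test would reject {a, b, c} because of dimension 1, while the subspace view accepts it, which is the motivation the abstract gives for combining the two paradigms.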