Comparing dimension reduction techniques for document clustering

H Xiong, J Wu, J Chen - Proceedings of the 12th ACM SIGKDD …, 2006 - dl.acm.org

K-means is a widely used partitional clustering method. While there are considerable
research efforts to characterize the key features of K-means clustering, further investigation …

Cited by 420 · Related articles · All 18 versions

Cluster analysis and K-means clustering: an introduction

J Wu, J Wu - Advances in K-Means clustering: A data mining …, 2012 - Springer

The phrase “data mining” was termed in the late eighties of the last century, which describes
the activity that attempts to extract interesting patterns from data. Since then, data mining and …

Cited by 164 · Related articles · All 2 versions

An improved ant algorithm with LDA-based representation for text document clustering

A Onan, H Bulut, S Korukoglu - Journal of Information …, 2017 - journals.sagepub.com

Document clustering can be applied in document organisation and browsing, document
summarisation and classification. The identification of an appropriate representation for …

Cited by 82 · Related articles · All 3 versions

An introduction to Johnson-Lindenstrauss transforms

CB Freksen - arXiv preprint arXiv:2103.00564, 2021 - arxiv.org

Johnson-Lindenstrauss transforms are powerful tools for reducing the dimensionality of
data while preserving key characteristics of that data, and they have found use in many …

Cited by 14 · Related articles · All 2 versions

Towards understanding hierarchical clustering: A data distribution perspective

J Wu, H Xiong, J Chen - Neurocomputing, 2009 - Elsevier

A very important category of clustering methods is hierarchical clustering. There are
considerable research efforts which have been focused on algorithm-level improvements of …

Cited by 63 · Related articles · All 4 versions

An online document clustering technique for short web contents

M Carullo, E Binaghi, I Gallo - Pattern Recognition Letters, 2009 - Elsevier

Document clustering techniques have been applied in several areas, with the web as one of
the most recent and influential. Both general-purpose and text-oriented techniques exist and …

Cited by 69 · Related articles · All 9 versions

Research of fast SOM clustering for text information

Y Liu, C Wu, M Liu - Expert Systems with Applications, 2011 - Elsevier

The state-of-the-art text clustering methods suffer from the huge size of documents with
high-dimensional features. In this paper, we studied fast SOM clustering technology for Text …

Cited by 54 · Related articles · All 4 versions

Combining semantic and term frequency similarities for text clustering

VHA Soares, RJGB Campello… - … and Information Systems, 2019 - Springer

A key challenge for document clustering consists in finding a proper similarity measure for
text documents that enables the generation of cohesive groups. Measures based on the …

Cited by 24 · Related articles · All 7 versions

CDIM: document clustering by discrimination information maximization

MT Hassan, A Karim, JB Kim, M Jeon - Information Sciences, 2015 - Elsevier

Ideally, document clustering methods should produce clusters that are semantically relevant
and readily understandable as collections of documents belonging to particular contexts or …

Cited by 30 · Related articles · All 6 versions

Singular Value Decomposition for dimensionality reduction in unsupervised text learning problems

TF Abidin, B Yusuf, M Umran - 2010 2nd International …, 2010 - ieeexplore.ieee.org

Partitioning vast amounts of text documents is a challenging problem due to a high
dimensional representation of the documents. In this study, we investigate the quality of text …

Cited by 26 · Related articles · All 2 versions