Robust unsupervised feature selection via dual self-representation and manifold regularization

C Tang, X Liu, M Li, P Wang, J Chen, L Wang, W Li
Knowledge-Based Systems, 2018, Elsevier
Abstract
Unsupervised feature selection has become an important and challenging pre-processing step in machine learning and data mining, since large amounts of unlabelled high-dimensional data often need to be processed. In this paper, we propose an efficient method for robust unsupervised feature selection via dual self-representation and manifold regularization, referred to as DSRMR. On the one hand, a feature self-representation term is used to learn the feature representation coefficient matrix, which measures the importance of different feature dimensions. On the other hand, a sample self-representation term is used to automatically learn the sample similarity graph, which preserves the local geometrical structure of the data; this structure has been shown to be critical in unsupervised feature selection. By using the l2,1-norm to regularize both the feature representation residual matrix and the representation coefficient matrix, our method is robust to outliers, and the row sparsity of the feature coefficient matrix induced by the l2,1-norm effectively selects representative features. During optimization, the feature coefficient matrix and the sample similarity graph constrain each other to reach the optimal solution. Experimental results on ten real-world data sets demonstrate that the proposed method effectively identifies important features, outperforming many state-of-the-art unsupervised feature selection methods in terms of clustering accuracy (ACC) and normalized mutual information (NMI).
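To illustrate the feature self-representation idea described in the abstract, the sketch below ranks features by the row l2-norms of a coefficient matrix W learned from X ≈ XW with an l2,1 penalty on W. This is a minimal, simplified approximation of that one component only: it uses a plain Frobenius-norm residual rather than the paper's l2,1-regularized residual, omits the sample similarity graph and manifold term entirely, and solves the problem with a standard iteratively reweighted least-squares scheme. The function name and parameters are illustrative, not from the paper.

```python
import numpy as np

def feature_self_representation_scores(X, lam=1.0, n_iter=30, eps=1e-8):
    """Score features via l2,1-regularized feature self-representation.

    Simplified sketch: approximately solves
        min_W ||X - X W||_F^2 + lam * ||W||_{2,1}
    by iteratively reweighted least squares, then returns the row
    l2-norms of W as per-feature importance scores (rows driven to
    zero by the l2,1 penalty correspond to discarded features).
    """
    n, d = X.shape
    G = X.T @ X                        # d x d Gram matrix over features
    D = np.ones(d)                     # IRLS weights, one per row of W
    for _ in range(n_iter):
        # Stationarity condition: (X^T X + lam * diag(D)) W = X^T X
        W = np.linalg.solve(G + lam * np.diag(D), G)
        row_norms = np.sqrt((W ** 2).sum(axis=1)) + eps
        D = 1.0 / (2.0 * row_norms)    # reweight: small rows are penalized more
    return np.sqrt((W ** 2).sum(axis=1))

# Toy usage: 100 samples, 20 features, one feature made nearly redundant.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 20))
X[:, 5] = 3.0 * X[:, 0] + 0.01 * rng.standard_normal(100)
scores = feature_self_representation_scores(X, lam=0.5)
top_features = np.argsort(scores)[::-1][:5]   # indices of highest-scoring features
```

In the full DSRMR objective, this coefficient matrix and the learned sample similarity graph constrain each other during optimization; the sketch above shows only the row-sparsity mechanism that makes the selected features representative.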