Effects of dataset characteristics on the performance of feature selection techniques

D Oreski, S Oreski, B Klicek - Applied Soft Computing, 2017 - Elsevier
Abstract
While extensive research in data mining has been devoted to developing better feature selection techniques, none of this research has examined the intrinsic relationship between dataset characteristics and a feature selection technique’s performance. Thus, our research examines experimentally how dataset characteristics affect both the accuracy and the time complexity of feature selection. To evaluate the performance of various feature selection techniques on datasets of different characteristics, extensive experiments with five feature selection techniques, three types of classification algorithms, seven types of dataset characterization methods and all possible combinations of dataset characteristics are conducted on 128 publicly available datasets. We apply the decision tree method to evaluate the interdependencies between dataset characteristics and performance. The results of the study reveal the intrinsic relationship between dataset characteristics and feature selection techniques’ performance. Additionally, our study contributes to research in data mining by providing a roadmap for future research on feature selection and a significantly wider framework for comparative analysis.
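The meta-level analysis described in the abstract (characterize each dataset, then use a decision tree to relate those characteristics to feature selection performance) can be illustrated with a minimal Python sketch. Everything here is an assumption for illustration only: the four characteristics in `characterize`, the function names, and the use of a one-level tree (a decision stump scored by Gini impurity) are not taken from the paper, which does not specify these details in its abstract.

```python
import math
from collections import Counter


def characterize(rows, labels):
    """Compute a few simple dataset characteristics.

    These four measures are illustrative choices, not the paper's
    seven characterization methods.
    """
    n, d = len(rows), len(rows[0])
    counts = Counter(labels)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return {
        "n_instances": n,
        "n_features": d,
        "dimensionality": d / n,   # features per instance
        "class_entropy": entropy,  # in bits
    }


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(meta_rows, targets):
    """Fit a one-level decision tree (stump) on meta-data.

    meta_rows: one dict of characteristics per dataset.
    targets:   one label per dataset, e.g. the best-performing technique.
    Returns (weighted_gini, characteristic, threshold), or None if no
    characteristic varies across datasets.
    """
    best = None
    n = len(meta_rows)
    for key in meta_rows[0]:
        values = sorted({r[key] for r in meta_rows})
        # Candidate thresholds are midpoints between adjacent distinct values.
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [t for r, t in zip(meta_rows, targets) if r[key] <= thr]
            right = [t for r, t in zip(meta_rows, targets) if r[key] > thr]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[0]:
                best = (score, key, thr)
    return best
```

In use, `characterize` would be called once per dataset, and `best_split` would receive the resulting dictionaries together with a per-dataset outcome, such as the label of the technique that scored highest. A full decision tree would apply the same split search recursively to each side.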