Learning in presence of class imbalance and class overlapping by using one-class SVM and undersampling technique
Connection Science, 2019 • Taylor & Francis
Abstract
The class imbalance problem impairs traditional learning models by degrading performance and yielding erroneous outcomes. It is the scenario in which the representation of one class is overshadowed by the other classes in a data space. Class imbalance can cause grave difficulty because the misclassification cost of the minority class tends to be very high. Overlapping cases, combined with imbalanced data, can create a grim situation for effective learning. In this study, an in-depth analysis of the effects of class imbalance and class overlapping on conventional learning models is presented. A data-level approach is adopted, with one-class SVM-based anomaly detection used to detect cases of data overlapping, while an adapted Tomek-link undersampling algorithm is defined to treat both overlapped and imbalanced cases. The proposed model eliminates borderline, redundant and overlapping cases by taking into account Tomek-link pairs and sparse neighbourhoods. The proposed method has been evaluated against six state-of-the-art models on seven binary and two multiclass datasets, with respect to three standard learning models. It has also been evaluated against cost-sensitive learning and extreme learning based approaches for imbalanced class learning. The proficiency of the proposed method over state-of-the-art models is established through experimental analyses.
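To make the Tomek-link idea concrete, the following is a minimal sketch of the classic Tomek-link undersampling rule, not the adapted variant the abstract proposes, and it omits the one-class SVM overlap-detection step entirely. A Tomek link is a pair of points from different classes that are each other's nearest neighbour; the classic rule drops the majority-class member of each pair. The function names (`nearest_neighbour`, `tomek_links`, `undersample`) and the toy data are hypothetical, chosen only for illustration.

```python
from math import dist


def nearest_neighbour(points, i):
    """Index of the point closest to points[i], excluding i itself."""
    return min((j for j in range(len(points)) if j != i),
               key=lambda j: dist(points[i], points[j]))


def tomek_links(points, labels):
    """Pairs (i, j), i < j, that are mutual nearest neighbours with different labels."""
    nn = [nearest_neighbour(points, i) for i in range(len(points))]
    return [(i, j) for i, j in enumerate(nn)
            if i < j and nn[j] == i and labels[i] != labels[j]]


def undersample(points, labels, majority_label):
    """Classic rule (assumption, not the paper's variant): drop the majority member of each link."""
    drop = set()
    for i, j in tomek_links(points, labels):
        drop.add(i if labels[i] == majority_label else j)
    keep = [k for k in range(len(points)) if k not in drop]
    return [points[k] for k in keep], [labels[k] for k in keep]


# Toy data: class 0 is the majority; the point (2.0, 2.0) sits right next to a minority point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (2.0, 2.0),
          (2.2, 2.0), (4.0, 4.0), (4.0, 5.0)]
labels = [0, 0, 0, 0, 0, 1, 1, 1]

new_points, new_labels = undersample(points, labels, majority_label=0)
```

On this toy set the only Tomek link is the pair at indices 4 and 5, so the majority point (2.0, 2.0) is removed and all minority points are kept. The brute-force neighbour search is quadratic in the number of points, which is acceptable for a sketch but not for large datasets.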