[BOOK][B] Model-based clustering and classification for data science: with applications in R
Cluster analysis finds groups in data automatically. Most methods have been heuristic and
leave open such central questions as: how many clusters are there? Which method should I …
Identifying mislabeled data using the area under the margin ranking
Not all data in a typical training set help with generalization; some samples can be overly
ambiguous or outrightly mislabeled. This paper introduces a new method to identify such …
Classification in the presence of label noise: a survey
B Frénay, M Verleysen - IEEE transactions on neural networks …, 2013 - ieeexplore.ieee.org
Label noise is an important issue in classification, with many potential negative
consequences. For example, the accuracy of predictions may decrease, whereas the …
SMOTE–IPF: Addressing the noisy and borderline examples problem in imbalanced classification by a re-sampling method with filtering
Classification datasets often have an unequal class distribution among their examples. This
problem is known as imbalanced classification. The Synthetic Minority Over-sampling …
Self-paced ensemble for highly imbalanced massive data classification
Many real-world applications reveal difficulties in learning classifiers from imbalanced data.
The rising big data era has been witnessing more classification tasks with large-scale but …
A machine learning perspective on the development of clinical decision support systems utilizing mass spectra of blood samples
H Shin, MK Markey - Journal of Biomedical Informatics, 2006 - Elsevier
Currently, the best way to reduce the mortality of cancer is to detect and treat it in the earliest
stages. Technological advances in genomics and proteomics have opened a new realm of …
Types of minority class examples and their influence on learning classifiers from imbalanced data
K Napierala, J Stefanowski - Journal of Intelligent Information Systems, 2016 - Springer
Many real-world applications reveal difficulties in learning classifiers from imbalanced data.
Although several methods for improving classifiers have been introduced, the identification …
Class noise vs. attribute noise: A quantitative study
Real-world data is never perfect and can often suffer from corruptions (noise) that may
impact interpretations of the data, models created from the data and decisions made based …
Analyzing the oversampling of different classes and types of examples in multi-class imbalanced datasets
Canonical machine learning algorithms assume that the number of objects in the considered
classes are roughly similar. However, in many real-life situations the distribution of examples …
Computational intelligence techniques for medical diagnosis and prognosis: Problems and current developments
Diagnosis, being the first step in medical practice, is very crucial for clinical decision making.
This paper investigates state-of-the-art computational intelligence (CI) techniques applied in …