[BOOK][B] Model-based clustering and classification for data science: with applications in R

C Bouveyron, G Celeux, TB Murphy, AE Raftery - 2019 - books.google.com
Cluster analysis finds groups in data automatically. Most methods have been heuristic and
leave open such central questions as: how many clusters are there? Which method should I …

Identifying mislabeled data using the area under the margin ranking

G Pleiss, T Zhang, E Elenberg… - Advances in Neural …, 2020 - proceedings.neurips.cc
Not all data in a typical training set help with generalization; some samples can be overly
ambiguous or outrightly mislabeled. This paper introduces a new method to identify such …

Classification in the presence of label noise: a survey

B Frénay, M Verleysen - IEEE transactions on neural networks …, 2013 - ieeexplore.ieee.org
Label noise is an important issue in classification, with many potential negative
consequences. For example, the accuracy of predictions may decrease, whereas the …

SMOTE–IPF: Addressing the noisy and borderline examples problem in imbalanced classification by a re-sampling method with filtering

JA Sáez, J Luengo, J Stefanowski, F Herrera - Information Sciences, 2015 - Elsevier
Classification datasets often have an unequal class distribution among their examples. This
problem is known as imbalanced classification. The Synthetic Minority Over-sampling …

Self-paced ensemble for highly imbalanced massive data classification

Z Liu, W Cao, Z Gao, J Bian, H Chen… - 2020 IEEE 36th …, 2020 - ieeexplore.ieee.org
Many real-world applications reveal difficulties in learning classifiers from imbalanced data.
The rising big data era has been witnessing more classification tasks with large-scale but …

[HTML] A machine learning perspective on the development of clinical decision support systems utilizing mass spectra of blood samples

H Shin, MK Markey - Journal of Biomedical Informatics, 2006 - Elsevier
Currently, the best way to reduce the mortality of cancer is to detect and treat it in the earliest
stages. Technological advances in genomics and proteomics have opened a new realm of …

Types of minority class examples and their influence on learning classifiers from imbalanced data

K Napierala, J Stefanowski - Journal of Intelligent Information Systems, 2016 - Springer
Many real-world applications reveal difficulties in learning classifiers from imbalanced data.
Although several methods for improving classifiers have been introduced, the identification …

Class noise vs. attribute noise: A quantitative study

X Zhu, X Wu - Artificial intelligence review, 2004 - Springer
Real-world data is never perfect and can often suffer from corruptions (noise) that may
impact interpretations of the data, models created from the data and decisions made based …

Analyzing the oversampling of different classes and types of examples in multi-class imbalanced datasets

JA Sáez, B Krawczyk, M Woźniak - Pattern Recognition, 2016 - Elsevier
Canonical machine learning algorithms assume that the number of objects in the considered
classes are roughly similar. However, in many real-life situations the distribution of examples …

Computational intelligence techniques for medical diagnosis and prognosis: Problems and current developments

AH Shahid, MP Singh - Biocybernetics and Biomedical Engineering, 2019 - Elsevier
Diagnosis, being the first step in medical practice, is very crucial for clinical decision making.
This paper investigates state-of-the-art computational intelligence (CI) techniques applied in …