Experiments with noise filtering in a medical domain

Model-based clustering and classification for data science: with applications in R

C Bouveyron, G Celeux, TB Murphy, AE Raftery - 2019 - books.google.com

Cluster analysis finds groups in data automatically. Most methods have been heuristic and
leave open such central questions as: how many clusters are there? Which method should I …

Cited by 330 · 10 versions

Identifying mislabeled data using the area under the margin ranking

G Pleiss, T Zhang, E Elenberg… - Advances in Neural …, 2020 - proceedings.neurips.cc

Not all data in a typical training set help with generalization; some samples can be overly
ambiguous or outrightly mislabeled. This paper introduces a new method to identify such …

Cited by 277 · 7 versions

Classification in the presence of label noise: a survey

B Frénay, M Verleysen - IEEE transactions on neural networks …, 2013 - ieeexplore.ieee.org

Label noise is an important issue in classification, with many potential negative
consequences. For example, the accuracy of predictions may decrease, whereas the …

Cited by 2072 · 13 versions

SMOTE–IPF: Addressing the noisy and borderline examples problem in imbalanced classification by a re-sampling method with filtering

JA Sáez, J Luengo, J Stefanowski, F Herrera - Information Sciences, 2015 - Elsevier

Classification datasets often have an unequal class distribution among their examples. This
problem is known as imbalanced classification. The Synthetic Minority Over-sampling …

Cited by 641 · 9 versions

Self-paced ensemble for highly imbalanced massive data classification

Z Liu, W Cao, Z Gao, J Bian, H Chen… - 2020 IEEE 36th …, 2020 - ieeexplore.ieee.org

Many real-world applications reveal difficulties in learning classifiers from imbalanced data.
The rising big data era has been witnessing more classification tasks with large-scale but …

Cited by 192 · 10 versions

A machine learning perspective on the development of clinical decision support systems utilizing mass spectra of blood samples

H Shin, MK Markey - Journal of Biomedical Informatics, 2006 - Elsevier

Currently, the best way to reduce the mortality of cancer is to detect and treat it in the earliest
stages. Technological advances in genomics and proteomics have opened a new realm of …

Cited by 147 · 8 versions

Types of minority class examples and their influence on learning classifiers from imbalanced data

K Napierala, J Stefanowski - Journal of Intelligent Information Systems, 2016 - Springer

Many real-world applications reveal difficulties in learning classifiers from imbalanced data.
Although several methods for improving classifiers have been introduced, the identification …

Cited by 320 · 10 versions

Class noise vs. attribute noise: A quantitative study

X Zhu, X Wu - Artificial intelligence review, 2004 - Springer

Real-world data is never perfect and can often suffer from corruptions (noise) that may
impact interpretations of the data, models created from the data and decisions made based …

Cited by 1141 · 11 versions

Analyzing the oversampling of different classes and types of examples in multi-class imbalanced datasets

JA Sáez, B Krawczyk, M Woźniak - Pattern Recognition, 2016 - Elsevier

Canonical machine learning algorithms assume that the number of objects in the considered
classes are roughly similar. However, in many real-life situations the distribution of examples …

Cited by 282 · 4 versions

Computational intelligence techniques for medical diagnosis and prognosis: Problems and current developments

AH Shahid, MP Singh - Biocybernetics and Biomedical Engineering, 2019 - Elsevier

Diagnosis, being the first step in medical practice, is very crucial for clinical decision making.
This paper investigates state-of-the-art computational intelligence (CI) techniques applied in …

Cited by 49 · 5 versions