Machine learning with oversampling and undersampling techniques: overview study and experimental results
R Mohammed, J Rawashdeh… - 2020 11th international …, 2020 - ieeexplore.ieee.org
Data imbalance in Machine Learning refers to an unequal distribution of classes within a
dataset. This issue is encountered mostly in classification tasks in which the distribution of …
REFORMS: Consensus-based Recommendations for Machine-learning-based Science
Machine learning (ML) methods are proliferating in scientific research. However, the
adoption of these methods has been accompanied by failures of validity, reproducibility, and …
Handling data irregularities in classification: Foundations, trends, and future challenges
Most of the traditional pattern classifiers assume their input data to be well-behaved in terms
of similar underlying class distributions, balanced size of classes, the presence of a full set of …
Experimental perspectives on learning from imbalanced data
J Van Hulse, TM Khoshgoftaar… - Proceedings of the 24th …, 2007 - dl.acm.org
We present a comprehensive suite of experimentation on the subject of learning from
imbalanced data. When classes are imbalanced, many learning algorithms can suffer from …
On the class imbalance problem
X Guo, Y Yin, C Dong, G Yang… - 2008 Fourth international …, 2008 - ieeexplore.ieee.org
The class imbalance problem has been recognized in many practical domains and a hot
topic of machine learning in recent years. In such a problem, almost all the examples are …
Balancing training data for automated annotation of keywords: a case study.
There has been an increasing interest in tools for automating the annotation of databases.
Machine learning techniques are promising candidates to help curators to, at least, guide …
Leave a reply: An analysis of weblog comments
G Mishne, N Glance - Third annual workshop on the …, 2006 - ambuehler.ethz.ch
Access to weblogs, both through commercial services and in academic studies, is usually
limited to the content of the weblog posts. This overlooks an important aspect distinguishing …
Preprocessing unbalanced data using support vector machine
MAH Farquad, I Bose - Decision Support Systems, 2012 - Elsevier
This paper deals with the application of support vector machine (SVM) to deal with the class
imbalance problem. The objective of this paper is to examine the feasibility and efficiency of …
Characterizing and predicting blocking bugs in open source projects
H Valdivia Garcia, E Shihab - … of the 11th working conference on mining …, 2014 - dl.acm.org
As software becomes increasingly important, its quality becomes an increasingly important
issue. Therefore, prior work focused on software quality and proposed many prediction …
New algorithms for efficient high-dimensional nonparametric classification.
This paper is about non-approximate acceleration of high-dimensional nonparametric
operations such as k nearest neighbor classifiers. We attempt to exploit the fact that even if …