Resampling imbalanced data for network intrusion detection datasets

S Bagui, K Li - Journal of Big Data, 2021 - Springer
Machine learning plays an increasingly significant role in the building of Network
Intrusion Detection Systems. However, machine learning models trained with imbalanced …

Adversarial approaches to tackle imbalanced data in machine learning

S Ayoub, Y Gulzar, J Rustamov, A Jabbari, FA Reegu… - Sustainability, 2023 - mdpi.com
Real-world applications often involve imbalanced datasets, which have different
distributions of examples across various classes. When building a system that requires a …

Synthetic minority oversampling technique for optimizing classification tasks in botnet and intrusion-detection-system datasets

D Gonzalez-Cuautle, A Hernandez-Suarez… - Applied Sciences, 2020 - mdpi.com
Presently, security is a hot research topic due to the impact in daily information infrastructure.
Machine-learning solutions have been improving classical detection practices, but detection …

An oversampling method for class imbalance problems on large datasets

F Rodríguez-Torres, JF Martínez-Trinidad… - Applied Sciences, 2022 - mdpi.com
Several oversampling methods have been proposed for solving the class imbalance
problem. However, most of them require searching the k-nearest neighbors to generate …

Approx-SMOTE: fast SMOTE for big data on apache spark

M Juez-Gil, Á Arnaiz-González, JJ Rodriguez… - Neurocomputing, 2021 - Elsevier
One of the main goals of Big Data research is to find new data mining methods that are able
to process large amounts of data in acceptable times. In Big Data classification, as in …

Feature selection for high-dimensional and imbalanced biomedical data based on robust correlation based redundancy and binary grasshopper optimization …

G Abdulrauf Sharifai, Z Zainol - Genes, 2020 - mdpi.com
The training machine learning algorithm from an imbalanced data set is an inherently
challenging task. It becomes more demanding with limited samples but with a massive …

A hybrid artificial intelligence algorithm for fault diagnosis of hot rolled strip crown imbalance

R Zhang, Y Qi, S Kong, X Wang, M Li - Engineering Applications of Artificial …, 2024 - Elsevier
In the production process of hot continuous rolling, due to the imbalance between the
number of normal cases and fault cases, the traditional supervised learning methods are …

Experimental evaluation of ensemble classifiers for imbalance in big data

M Juez-Gil, Á Arnaiz-González, JJ Rodríguez… - Applied soft …, 2021 - Elsevier
Datasets are growing in size and complexity at a pace never seen before, forming ever
larger datasets known as Big Data. A common problem for classification, especially in Big …

Combination of PCA with SMOTE oversampling for classification of high-dimensional imbalanced data

GAA Mulla, Y Demir, M Hassan - Bitlis Eren Üniversitesi Fen …, 2021 - dergipark.org.tr
Imbalanced data classification, in which classifiers are skewed toward the larger class, is a
common issue in data mining. High-dimensional skewed (imbalanced) data …

An analysis of local and global solutions to address big data imbalanced classification: a case study with SMOTE preprocessing

MJ Basgall, W Hasperué, M Naiouf… - Cloud Computing and …, 2019 - Springer
Addressing the huge amount of data continuously generated is an important challenge in
the Machine Learning field. The need to adapt the traditional techniques or create new ones …