Surveying stylometry techniques and applications

T Neal, K Sundararajan, A Fatima, Y Yan… - ACM Computing …, 2017 - dl.acm.org
The analysis of authorial style, termed stylometry, assumes that style is quantifiably
measurable for evaluation of distinctive qualities. Stylometry research has yielded several …

Principal component analysis: A natural approach to data exploration

FL Gewers, GR Ferreira, HFD Arruda, FN Silva… - ACM Computing …, 2021 - dl.acm.org
Principal component analysis (PCA) is often applied for analyzing data in the most diverse
areas. This work reports, in an accessible and integrated manner, several theoretical and …

Two feature weighting approaches for naive Bayes text classifiers

L Zhang, L Jiang, C Li, G Kong - Knowledge-Based Systems, 2016 - Elsevier
This paper works on feature weighting approaches for naive Bayes text classifiers. Almost all
existing feature weighting approaches for naive Bayes text classifiers have some defects …

Sparse group lasso and high dimensional multinomial classification

M Vincent, NR Hansen - Computational Statistics & Data Analysis, 2014 - Elsevier
The sparse group lasso optimization problem is solved using a coordinate gradient descent
algorithm. The algorithm is applicable to a broad class of convex loss functions …

Gradient-based kernel dimension reduction for regression

K Fukumizu, C Leng - Journal of the American Statistical …, 2014 - Taylor & Francis
This article proposes a novel approach to linear dimension reduction for regression using
nonparametric estimation with positive-definite kernels or reproducing kernel Hilbert spaces …

A distributed approach for high-dimensionality heterogeneous data reduction

RM Gahar, O Arfaoui, MS Hidri… - IEEE Access, 2019 - ieeexplore.ieee.org
The recent explosion of data size in number of records and attributes has triggered the
development of a number of Big Data analytics as well as parallel data processing methods …

Exploring bias in GAN-based data augmentation for small samples

M Hu, J Li - arXiv preprint arXiv:1905.08495, 2019 - arxiv.org
For a machine learning task, lacking sufficient samples means the trained model has low
confidence to approach the ground truth function. Until recently, after the generative …

Dense adaptive cascade forest: a self-adaptive deep ensemble for classification problems

H Wang, Y Tang, Z Jia, F Ye - Soft Computing, 2020 - Springer
Recent research has shown that the deep forest ensemble achieves a considerable
increase in classification accuracy compared with the general ensemble learning methods …

Modern synergetic neural network for imbalanced small data classification

Z Wang, H Li, L Ma - Scientific Reports, 2023 - nature.com
Deep learning's performance on imbalanced small data is substantially degraded by
overfitting. Recurrent neural networks retain better performance in such tasks by …

Gradient-based kernel method for feature extraction and variable selection

K Fukumizu, C Leng - Advances in neural information …, 2012 - proceedings.neurips.cc
We propose a novel kernel approach to dimension reduction for supervised learning: feature
extraction and variable selection; the former constructs a small number of features from …