Text classification using machine learning techniques.

M Ikonomakis, S Kotsiantis, V Tampakas - WSEAS transactions on …, 2005 - Citeseer
Automated text classification has been considered as a vital method to manage and process
a vast amount of documents in digital forms that are widespread and continuously …

Genetic algorithms in feature and instance selection

CF Tsai, W Eberle, CY Chu - Knowledge-Based Systems, 2013 - Elsevier
Feature selection and instance selection are two important data preprocessing steps in data
mining, where the former is aimed at removing some irrelevant and/or redundant features …

On strategies for imbalanced text classification using SVM: A comparative study

A Sun, EP Lim, Y Liu - Decision Support Systems, 2009 - Elsevier
Many real-world text classification tasks involve imbalanced training examples. The
strategies proposed to address the imbalanced classification (e.g., resampling, instance …

Instance and feature selection using fuzzy rough sets: a bi-selection approach for data reduction

X Zhang, C Mei, J Li, Y Yang… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Data reduction, aiming to reduce the original data by selecting the most representative
information, is an important technique of preprocessing data. At present, large-scale or huge …

Feature selection methods for text classification

A Dasgupta, P Drineas, B Harb, V Josifovski… - Proceedings of the 13th …, 2007 - dl.acm.org
We consider feature selection for text classification both theoretically and empirically. Our
main result is an unsupervised feature selection strategy for which we give worst-case …

Data classification methods using machine learning techniques

MAR Schmidtler, R Borrey - US Patent 7,937,345, 2011 - Google Patents
A method for adapting to a shift in document content according to one embodiment of the
present invention includes receiving at least one labeled seed document; receiving …

Data classification methods using machine learning techniques

MAR Schmidtler, R Borrey, A Sarah - US Patent 7,958,067, 2011 - Google Patents
Methods for classifying documents are presented. Methods for analyzing documents
associated with legal discovery are also presented. Methods for cleaning up data are also …

An intuitionistic fuzzy bireduct model and its application to cancer treatment

P Jain, AK Tiwari, T Som - Computers & Industrial Engineering, 2022 - Elsevier
Due to technological advancement, data size has seen a significant increase both in terms
of features and instances. An efficient way to handle large sized datasets is to apply data …

Text categorization and machine learning methods: current state of the art

DB Dasari, VG Rao - Global Journal of Computer Science and …, 2012 - academia.edu
In this informative age, we find many documents are available in digital forms which need
classification of the text. For solving this major problem present researchers focused on …

Variational learning of a Dirichlet process of generalized Dirichlet distributions for simultaneous clustering and feature selection

W Fan, N Bouguila - Pattern Recognition, 2013 - Elsevier
This paper introduces a novel enhancement for unsupervised feature selection based on
generalized Dirichlet (GD) mixture models. Our proposal is based on the extension of the …