A recent overview of the state-of-the-art elements of text classification

MM Mirończuk, J Protasiewicz - Expert Systems with Applications, 2018 - Elsevier
The aim of this study is to provide an overview of the state-of-the-art elements of text
classification. For this purpose, we first select and investigate the primary and recent studies …

Hybrid embedding-based text representation for hierarchical multi-label text classification

Y Ma, X Liu, L Zhao, Y Liang, P Zhang, B Jin - Expert Systems with …, 2022 - Elsevier
Many real-world text classification tasks often deal with a large number of closely related
categories organized in a hierarchical structure or taxonomy. Hierarchical multi-label text …

Improving multi-label text classification using weighted information gain and co-trained Multinomial Naive Bayes classifier

W Kaur, V Balakrishnan, KS Wong - Malaysian Journal of …, 2022 - mojem.um.edu.my
Over recent years, the emergence of electronic text processing systems has generated a
vast amount of structured and unstructured data, thus creating a challenging situation for …

Research on the improved Word2Vec optimization strategy based on statistical language model

S Lei - 2020 international conference on information science …, 2020 - ieeexplore.ieee.org
In order to improve the text matching degree and calculation accuracy of the short text
classification method, this paper studies the optimization of the short text classification …

[Retracted] Construction of Machine Learning Model Based on Text Mining and Ranking of Meituan Merchants

Y Tang, D Liao, S Huang, Q Fan… - Scientific Programming, 2021 - Wiley Online Library
In the Web 2.0 era, the problem of uneven quality and overload of online reviews is very
serious, and the cognitive cost of obtaining valuable content from them is getting higher and …

Alternatives to Classic BM25-IDF based on a New Information Theoretical Framework

W Ke - 2022 IEEE International Conference on Big Data (Big …, 2022 - ieeexplore.ieee.org
The IDF (Inverse Document Frequency) term weighting method is a classic treatment of a
term's significance in information retrieval and text analytics. IDF can be derived from the …

A study on the relationship between class similarity and the performance of hierarchical classification method in a text document classification problem

S Jang, D Min - Journal of Society for e-Business Studies, 2022 - calsec.or.kr
The literature has reported that hierarchical classification methods generally outperform the
flat classification methods for a multi-class document classification problem. Unlike the …

Smart Governance Tool's Design to Monitor the Commitments of Bio-Business Licensing in Indonesia

MM Maulana, AI Suroso, Y Nurhadryani, KB Seminar - 2024 - preprints.org
Some business license commitments in online single submission (OSS) currently only
consist of an independent statement from the business actor, and there is no time limit for …

A feature selection method for classifying highly similar text documents

J Kim, D Min - Industrial Engineering & Management Systems, 2021 - dbpia.co.kr
In the era of big data, the importance of data classification is increasing. However, when it
comes to classifying text documents, several obstacles degrade classification performance …

On Triangular Inequality of the Discounted Least Information Theory of Entropy (DLITE)

KS Umare, W Ke - arXiv preprint arXiv:2210.08079, 2022 - arxiv.org
The Discounted Least Information Theory of Entropy (DLITE) is a new information measure
that quantifies the amount of entropic difference between two probability distributions. It …