The impact of preprocessing on text classification

AK Uysal, S Gunal - Information processing & management, 2014 - Elsevier
Preprocessing is one of the key components in a typical text classification framework. This
paper aims to extensively examine the impact of preprocessing on text classification in terms …

[BOOK] Handbook of natural language processing

N Indurkhya, FJ Damerau - 2010 - taylorfrancis.com
The Handbook of Natural Language Processing, Second Edition presents practical tools
and techniques for implementing natural language processing in computer systems. Along …

Impact of tokenization on language models: An analysis for Turkish

C Toraman, EH Yilmaz, F Şahinuç… - ACM Transactions on …, 2023 - dl.acm.org
Tokenization is an important text preprocessing step to prepare input tokens for deep
language models. WordPiece and BPE are de facto methods employed by important …

Is text preprocessing still worth the time? A comparative survey on the influence of popular preprocessing methods on Transformers and traditional classifiers

M Siino, I Tinnirello, M La Cascia - Information Systems, 2024 - Elsevier
With the advent of the modern pre-trained Transformers, the text preprocessing has started
to be neglected and not specifically addressed in recent NLP literature. However, both from …

TTC-3600: A new benchmark dataset for Turkish text categorization

D Kılınç, A Özçift, F Bozyigit, P Yıldırım… - Journal of …, 2017 - journals.sagepub.com
Owing to the rapid growth of the World Wide Web, the number of documents that can be
accessed via the Internet explosively increases with each passing day. Considering news …

Deep sentiment analysis: a case study on stemmed Turkish Twitter data

HA Shehu, MH Sharif, MHU Sharif, R Datta… - IEEE …, 2021 - ieeexplore.ieee.org
Sentiment analysis using stemmed Twitter data from various languages is an emerging
research topic. In this paper, we address three data augmentation techniques namely Shift …

KNN algoritması ve R dili ile metin madenciliği kullanılarak bilimsel makale tasnifi [Classification of scientific articles using text mining with the KNN algorithm and the R language]

D Kılınç, E Borandağ, F Yücalar, V Tunalı… - Marmara Fen Bilimleri …, 2016 - dergipark.org.tr
To analyze text-based datasets, techniques and methods from Text Mining (TM), a subfield
of Data Mining, are used. This …

Analysis of preprocessing methods on classification of Turkish texts

D Torunoğlu, E Çakirman, MC Ganiz… - … on Innovations in …, 2011 - ieeexplore.ieee.org
Preprocessing is an important task and critical step in information retrieval and text mining.
The objective of this study is to analyze the effect of preprocessing methods in text …

Emotion analysis from Turkish tweets using deep neural networks

MA Tocoglu, O Ozturkmenoglu, A Alpkocak - IEEE Access, 2019 - ieeexplore.ieee.org
Text data analysis of social media is becoming more and more important since it includes
the most recent information on what people think about. Likewise, emotion is one of the most …

The impact of feature extraction and selection on SMS spam filtering

AK Uysal, S Gunal, S Ergin, ES Gunal - Elektronika ir Elektrotechnika, 2013 - eejournal.ktu.lt
This paper investigates the impact of several feature extraction and feature selection
approaches on filtering of short message service (SMS) spam messages in two different …