Measurement of text similarity: a survey

J Wang, Y Dong - Information, 2020 - mdpi.com
Text similarity measurement is the basis of natural language processing tasks, which play an
important role in information retrieval, automatic question answering, machine translation …

A recent overview of the state-of-the-art elements of text classification

MM Mirończuk, J Protasiewicz - Expert Systems with Applications, 2018 - Elsevier
The aim of this study is to provide an overview of the state-of-the-art elements of text
classification. For this purpose, we first select and investigate the primary and recent studies …

Semantic text classification: A survey of past and recent advances

B Altınel, MC Ganiz - Information Processing & Management, 2018 - Elsevier
Automatic text classification is the task of organizing documents into pre-determined classes,
generally using machine learning algorithms. Generally speaking, it is one of the most …

Approaches to automated detection of cyberbullying: A survey

S Salawu, Y He, J Lumsden - IEEE Transactions on Affective …, 2017 - ieeexplore.ieee.org
Research into cyberbullying detection has increased in recent years, due in part to the
proliferation of cyberbullying across social media and its detrimental effect on young people …

Sustainable bioethanol production from first- and second-generation sugar-based feedstocks: Advanced bibliometric analysis

CEC Guimarães, FS Neto, V de Castro Bizerra… - Bioresource Technology …, 2023 - Elsevier
Bioethanol is produced from carbohydrate-containing feedstocks through fermentation.
Based on a bibliometric review of studies published between 2012 and 2021, we analyzed …

An empirical comparison of four text mining methods

S Lee, J Song, Y Kim - Journal of Computer Information Systems, 2010 - Taylor & Francis
The amount of textual data that is available for researchers and businesses to analyze is
increasing at a dramatic rate. This reality has led IS researchers to investigate various text …

Detecting cyberbullying: query terms and techniques

A Kontostathis, K Reynolds, A Garron… - Proceedings of the 5th …, 2013 - dl.acm.org
In this paper we describe a close analysis of the language used in cyberbullying. We take as
our corpus a collection of posts from Formspring.me. Formspring.me is a social networking …

An approach to source-code plagiarism detection and investigation using latent semantic analysis

G Cosma, M Joy - IEEE Transactions on Computers, 2011 - ieeexplore.ieee.org
Plagiarism is a growing problem in academia. Academics often use plagiarism detection
tools to detect similar source-code files. Once similar files are detected, the academic …

Quantitative approaches to content analysis: Identifying conceptual drift across publication outlets

M Indulska, DS Hovorka, J Recker - European Journal of …, 2012 - Taylor & Francis
Unstructured text data, such as emails, blogs, contracts, academic publications,
organizational documents, transcribed interviews, and even tweets, are important sources of …

An empirical study of required dimensionality for large-scale latent semantic indexing applications

RB Bradford - Proceedings of the 17th ACM conference on …, 2008 - dl.acm.org
The technique of latent semantic indexing is used in a wide variety of commercial
applications. In these applications, the processing time and RAM required for SVD …