Text stemming: Approaches, applications, and challenges

J Singh, V Gupta - ACM Computing Surveys (CSUR), 2016 - dl.acm.org
Stemming is a process in which the variant word forms are mapped to their base form. It is
among the basic text pre-processing approaches used in Language Modeling, Natural …

Unsupervised learning of morphology

H Hammarström, L Borin - Computational Linguistics, 2011 - direct.mit.edu
This article surveys work on Unsupervised Learning of Morphology. We define
Unsupervised Learning of Morphology as the problem of inducing a description (of some …

A comparative study of stemming algorithms

AG Jivani - Int. J. Comp. Tech. Appl, 2011 - kenbenoit.net
Stemming is a pre-processing step in Text Mining applications as well as a very common
requirement of Natural Language processing functions. In fact it is very important in most of …

Comparing apples to apple: The effects of stemmers on topic models

A Schofield, D Mimno - Transactions of the Association for …, 2016 - direct.mit.edu
Rule-based stemmers such as the Porter stemmer are frequently used to preprocess English
corpora for topic modeling. In this work, we train and evaluate topic models on a variety of …

A survey of stemming algorithms in information retrieval.

C Moral, A de Antonio, R Imbert, J Ramírez - Information Research: An …, 2014 - ERIC
Background: During the last fifty years, improved information retrieval techniques have
become necessary because of the huge amount of information people have available, which …

A systematic review of text stemming techniques

J Singh, V Gupta - Artificial Intelligence Review, 2017 - Springer
Stemming is a program that matches the morphological variants of the word to its root word.
Stemming is extensively used as a pre-processing tool in the field of natural language …

Translation techniques in cross-language information retrieval

D Zhou, M Truran, T Brailsford, V Wade… - ACM Computing …, 2012 - dl.acm.org
Cross-language information retrieval (CLIR) is an active sub-domain of information retrieval
(IR). Like IR, CLIR is centered on the search for documents and for information contained …

Automatic training of lemmatization rules that handle morphological changes in pre-, in- and suffixes alike

B Jongejan, H Dalianis - Proceedings of the Joint Conference of …, 2009 - aclanthology.org
We propose a method to automatically train lemmatization rules that handle prefix, infix and
suffix changes to generate the lemma from the full form of a word. We explain how the …

A comprehensive survey on Indian regional language processing

BS Harish, RK Rangan - SN Applied Sciences, 2020 - Springer
In recent information explosion, contents in internet are multilingual and majority will be in
the form of natural languages. Processing of these natural languages for various language …

DErivBase: Inducing and evaluating a derivational morphology resource for German

B Zeller, J Šnajder, S Padó - … of the 51st annual meeting of the …, 2013 - aclanthology.org
Derivational models are still an under-researched area in computational morphology. Even
for German, a rather resource-rich language, there is a lack of large-coverage derivational …