Text stemming: Approaches, applications, and challenges
Stemming is a process in which the variant word forms are mapped to their base form. It is
among the basic text pre-processing approaches used in Language Modeling, Natural …
Unsupervised learning of morphology
H Hammarström, L Borin - Computational Linguistics, 2011 - direct.mit.edu
This article surveys work on Unsupervised Learning of Morphology. We define
Unsupervised Learning of Morphology as the problem of inducing a description (of some …
A comparative study of stemming algorithms
AG Jivani - Int. J. Comp. Tech. Appl, 2011 - kenbenoit.net
Stemming is a pre-processing step in Text Mining applications as well as a very common
requirement of Natural Language processing functions. In fact it is very important in most of …
Comparing apples to apple: The effects of stemmers on topic models
A Schofield, D Mimno - Transactions of the Association for …, 2016 - direct.mit.edu
Rule-based stemmers such as the Porter stemmer are frequently used to preprocess English
corpora for topic modeling. In this work, we train and evaluate topic models on a variety of …
A survey of stemming algorithms in information retrieval.
Background: During the last fifty years, improved information retrieval techniques have
become necessary because of the huge amount of information people have available, which …
A systematic review of text stemming techniques
Stemming is a program that matches the morphological variants of the word to its root word.
Stemming is extensively used as a pre-processing tool in the field of natural language …
Translation techniques in cross-language information retrieval
Cross-language information retrieval (CLIR) is an active sub-domain of information retrieval
(IR). Like IR, CLIR is centered on the search for documents and for information contained …
Automatic training of lemmatization rules that handle morphological changes in pre-, in-and suffixes alike
B Jongejan, H Dalianis - Proceedings of the Joint Conference of …, 2009 - aclanthology.org
We propose a method to automatically train lemmatization rules that handle prefix, infix and
suffix changes to generate the lemma from the full form of a word. We explain how the …
A comprehensive survey on Indian regional language processing
In recent information explosion, contents in internet are multilingual and majority will be in
the form of natural languages. Processing of these natural languages for various language …
DErivBase: Inducing and evaluating a derivational morphology resource for German
Derivational models are still an under-researched area in computational morphology. Even
for German, a rather resource-rich language, there is a lack of large-coverage derivational …