Stemming and its effects on TF-IDF ranking

C Carpineto, S Osiński, G Romano… - ACM Computing Surveys …, 2009 - dl.acm.org

Web clustering engines organize search results by topic, thus offering a complementary
view to the flat-ranked list returned by conventional search engines. In this survey, we …

Cited by 529 · Related articles · All 9 versions

A study of stemming effects on information retrieval in Bahasa Indonesia

F Tala - 2003 - eprints.illc.uva.nl

Stemming is a process which provides a mapping of different morphological variants of
words into their base/common word (stem). This process is also known as conflation. Based …

Cited by 619 · Related articles · All 7 versions

Automated duplicate detection for bug tracking systems

N Jalbert, W Weimer - … Systems and Networks With FTCS and …, 2008 - ieeexplore.ieee.org

Bug tracking systems are important tools that guide the maintenance activities of software
developers. The utility of these systems is hampered by an excessive number of duplicate …

Cited by 406 · Related articles · All 10 versions

TeKET: a Tree-Based Unsupervised Keyphrase Extraction Technique

G Rabby, S Azad, M Mahmud, KZ Zamli… - Cognitive …, 2020 - Springer

Automatic keyphrase extraction techniques aim to extract quality keyphrases for higher level
summarization of a document. Majority of the existing techniques are mainly domain …

Cited by 97 · Related articles · All 7 versions

Citation intent classification using word embedding

M Roman, A Shahid, S Khan, A Koubaa, L Yu - IEEE Access, 2021 - ieeexplore.ieee.org

Citation analysis is an active area of research for various reasons. So far, statistical
approaches are mainly used for citation analysis, which does not look into the internal …

Cited by 68 · Related articles · All 5 versions

Comments mining with TF-IDF: the inherent bias and its removal

I Yahav, O Shehory, D Schwartz - IEEE Transactions on …, 2018 - ieeexplore.ieee.org

Text mining has gained great momentum in recent years, with user-generated content
becoming widely available. One key use is comment mining, with much attention being …

Cited by 117 · Related articles · All 3 versions

Sentence retrieval using stemming and lemmatization with different length of the queries

I Boban, A Doko, S Gotovac - Advances in Science, Technology and …, 2020 - elib.dlr.de

In this paper we focus on Sentence retrieval which is similar to Document retrieval but with a
smaller unit of retrieval. Using data pre-processing in document retrieval is generally …

Cited by 37 · Related articles · All 5 versions

A tecnologia de mineração de textos

C Aranha, E Passos - Revista Eletrônica de Sistemas de …, 2006 - periodicosibepes.org.br

Text mining, also known as textual data mining or knowledge discovery from textual
databases, generally refers to the process of extracting …

Cited by 111 · Related articles · All 6 versions

A novel approach for initializing the spherical K-means clustering algorithm

R Duwairi, M Abu-Rahmeh - Simulation Modelling Practice and Theory, 2015 - Elsevier

In this paper, a novel approach for initializing the spherical K-means algorithm is proposed.
It is based on calculating well distributed seeds across the input space. Also, a new measure …

Cited by 72 · Related articles · All 3 versions

Elsevier journal finder: recommending journals for your paper

N Kang, MA Doornenbal… - Proceedings of the 9th …, 2015 - dl.acm.org

Rejection is the norm in academic publishing. One of the main reasons for rejections is that
the topics of the submitted papers are not relevant to the scope of the journal, even when the …

Cited by 67 · Related articles · All 2 versions