A statistical corpus-based term extractor

P Buitelaar, P Cimiano, B Magnini - Ontology learning from text: Methods …, 2005 - Citeseer

This volume brings together a collection of extended versions of selected papers from two
workshops on ontology learning, knowledge acquisition and related topics that were …

Cited by 471 Related articles All 7 versions

[PDF] marcobaroni.org

[PDF] BootCaT: Bootstrapping Corpora and Terms from the Web.

M Baroni, S Bernardini - LREC, 2004 - marcobaroni.org

This paper introduces the BootCaT toolkit, a suite of perl programs implementing an iterative
procedure to bootstrap specialized corpora and terms from the web. The procedure requires …

Cited by 699 Related articles All 14 versions

[PDF] aclanthology.org

[PDF] A language model approach to keyphrase extraction

T Tomokiyo, M Hurst - Proceedings of the ACL 2003 workshop on …, 2003 - aclanthology.org

We present a new approach to extracting keyphrases based on statistical language models.
Our approach is to use pointwise KL-divergence between multiple language models for …

Cited by 421 Related articles All 9 versions

[PDF] academia.edu

[PDF] Blogpulse: Automated trend discovery for weblogs

N Glance, M Hurst, T Tomokiyo - WWW 2004 workshop on the …, 2004 - academia.edu

Over the past few years, weblogs have emerged as a new communication and publication
medium on the Internet. In this paper, we describe the application of data mining, information …

Cited by 348 Related articles All 6 versions

[PDF] archive.org

Determination of unithood and termhood for term recognition

W Wong - Handbook of research on text and web mining …, 2009 - igi-global.com

As more electronic text is readily available, and more applications become knowledge
intensive and ontology-enabled, term extraction, also known as automatic term recognition …

Cited by 56 Related articles All 4 versions

[PDF] aclanthology.org

[PDF] Improving statistical machine translation using domain bilingual multiword expressions

Z Ren, Y Lü, J Cao, Q Liu, Y Huang - Proceedings of the Workshop …, 2009 - aclanthology.org

Multiword expressions (MWEs) have been proved useful for many natural language
processing tasks. However, how to use them to improve performance of statistical machine …

Cited by 125 Related articles All 18 versions

[PDF] stanford.edu

Towards the web of concepts: Extracting concepts from large datasets

A Parameswaran, H Garcia-Molina… - Proceedings of the VLDB …, 2010 - dl.acm.org

Concepts are sequences of words that represent real or imaginary entities or ideas that
users are interested in. As a first step towards building a web of concepts that will form the …

Cited by 109 Related articles All 10 versions

[PDF] aclanthology.org

[PDF] A nonparametric method for extraction of candidate phrasal terms

P Deane - Proceedings of the 43rd Annual Meeting of the …, 2005 - aclanthology.org

This paper introduces a new method for identifying candidate phrasal terms (also known as
multiword units) which applies a nonparametric, rank-based heuristic measure. Evaluation …

Cited by 109 Related articles All 14 versions

[PDF] hal.science

Rule-based automatic multi-word term extraction and lemmatization

R Stanković, C Krstev, I Obradović, B Lazić, A Trtovac - LREC, 2016 - hal.science

In this paper we present a rule-based method for multi-word term extraction that relies on
extensive lexical resources in the form of electronic dictionaries and finite-state transducers …

Cited by 56 Related articles All 18 versions

SEWAR: A corpus-based N-gram approach for extracting semantically-related words from Arabic medical corpus

RH AlMahmoud, BH Hammo - Expert Systems with Applications, 2024 - Elsevier

Automatic aggregation of similar words into semantically related groups (or clusters) is of
interest to many natural language processing (NLP) applications. Extracting semantically …

Cited by 3 Related articles All 2 versions