Ontology learning from text: An overview

P Buitelaar, P Cimiano, B Magnini - Ontology learning from text: Methods …, 2005 - Citeseer
This volume brings together a collection of extended versions of selected papers from two
workshops on ontology learning, knowledge acquisition and related topics that were …

BootCaT: Bootstrapping Corpora and Terms from the Web.

M Baroni, S Bernardini - LREC, 2004 - marcobaroni.org
This paper introduces the BootCaT toolkit, a suite of perl programs implementing an iterative
procedure to bootstrap specialized corpora and terms from the web. The procedure requires …

A language model approach to keyphrase extraction

T Tomokiyo, M Hurst - Proceedings of the ACL 2003 workshop on …, 2003 - aclanthology.org
We present a new approach to extracting keyphrases based on statistical language models.
Our approach is to use pointwise KL-divergence between multiple language models for …

Blogpulse: Automated trend discovery for weblogs

N Glance, M Hurst, T Tomokiyo - WWW 2004 workshop on the …, 2004 - academia.edu
Over the past few years, weblogs have emerged as a new communication and publication
medium on the Internet. In this paper, we describe the application of data mining, information …

Determination of unithood and termhood for term recognition

W Wong - Handbook of research on text and web mining …, 2009 - igi-global.com
As more electronic text is readily available, and more applications become knowledge
intensive and ontology-enabled, term extraction, also known as automatic term recognition …

Improving statistical machine translation using domain bilingual multiword expressions

Z Ren, Y Lü, J Cao, Q Liu, Y Huang - Proceedings of the Workshop …, 2009 - aclanthology.org
Multiword expressions (MWEs) have been proved useful for many natural language
processing tasks. However, how to use them to improve performance of statistical machine …

Towards the web of concepts: Extracting concepts from large datasets

A Parameswaran, H Garcia-Molina… - Proceedings of the VLDB …, 2010 - dl.acm.org
Concepts are sequences of words that represent real or imaginary entities or ideas that
users are interested in. As a first step towards building a web of concepts that will form the …

A nonparametric method for extraction of candidate phrasal terms

P Deane - Proceedings of the 43rd Annual Meeting of the …, 2005 - aclanthology.org
This paper introduces a new method for identifying candidate phrasal terms (also known as
multiword units) which applies a nonparametric, rank-based heuristic measure. Evaluation …

Rule-based automatic multi-word term extraction and lemmatization

R Stanković, C Krstev, I Obradović, B Lazić, A Trtovac - LREC, 2016 - hal.science
In this paper we present a rule-based method for multi-word term extraction that relies on
extensive lexical resources in the form of electronic dictionaries and finite-state transducers …

SEWAR: A corpus-based N-gram approach for extracting semantically-related words from Arabic medical corpus

RH AlMahmoud, BH Hammo - Expert Systems with Applications, 2024 - Elsevier
Automatic aggregation of similar words into semantically related groups (or clusters) is of
interest to many natural language processing (NLP) applications. Extracting semantically …