Critical survey of the freely available Arabic corpora

W Zaghouani - arXiv preprint arXiv:1702.07835, 2017 - arxiv.org
The availability of corpora is a major factor in building natural language processing
applications. However, the costs of acquiring corpora can prevent some researchers from …

Parsing models for identifying multiword expressions

S Green, MC de Marneffe, CD Manning - Computational Linguistics, 2013 - direct.mit.edu
Multiword expressions lie at the syntax/semantics interface and have motivated alternative
theories of syntax like Construction Grammar. Until now, however, syntactic analysis and …

Detection of verbal multi-word expressions via conditional random fields with syntactic dependency features and semantic re-ranking

A Maldonado, L Han, E Moreau… - Proceedings of the 13th …, 2017 - hal.science
A description of a system for identifying Verbal Multi-Word Expressions (VMWEs) in running
text is presented. The system mainly exploits universal syntactic dependency features …

A generic and open framework for multiword expressions treatment: from acquisition to applications

C Ramisch - 2012 - hal.science
The treatment of multiword expressions (MWEs), like take off, bus stop and big deal, is a
challenge for NLP applications. This kind of linguistic construction is not only arbitrary but …

An ontology-based summarization system for Arabic documents (OSSAD)

I Imam, N Nounou, A Hamouda… - International Journal of …, 2013 - academia.edu
With the problem of increased web resources and the huge amount of information available,
the necessity of having automatic summarization systems appeared. Since summarization is …

Arabic nested noun compound extraction based on linguistic features and statistical measures

N Omar, Q Al-Tashi - GEMA Online Journal of Language Studies, 2018 - academia.edu
The extraction of Arabic nested noun compound is significant for several research areas
such as sentiment analysis, text summarization, word categorization, grammar checker, and …

Light verb constructions in the SzegedParalellFX English-Hungarian parallel corpus

V Vincze - 2012 - lrec.elra.info
In this paper, we describe the first English–Hungarian parallel corpus annotated for light
verb constructions, which contains 14,261 sentence alignment units. Annotation principles …

Arabic Domain Terminology Extraction: A Literature Review (Short Paper)

I Bounhas, W Lahbib, B Elayeb - On the Move to Meaningful Internet …, 2014 - Springer
Domain terminology extraction is an important step in many applications such as
ontology building and information retrieval. Analyzing a corpus to automatically extract key …

MWE-finder: Querying for multiword expressions in large Dutch text corpora

J Odijk, M Kroon, S Spoel, B Bonfil… - … expressions in lexical …, 2024 - books.google.com
We present MWE-Finder, an application that enables a user to search for multiword
expressions (MWEs) in large Dutch text corpora. Components of many MWEs in Dutch can …

Semantic reranking of CRF label sequences for verbal multiword expression identification

E Moreau, A Alsulaimani, A Maldonado, L Han… - 2018 - library.oapen.org
… that of its constituent
words. This is why it uses semantic features based on comparing the context vector of a …