The Saudi novel corpus: Design and compilation
Arabic has recently received significant attention from corpus compilers. This situation has
led to the creation of many Arabic corpora that cover various genres, most notably the …
Hierarchical aggregation of dialectal data for Arabic dialect identification
Arabic is a collection of dialectal variants that are historically related but significantly
different. These differences can be seen across regions, countries, and even cities in the …
An incremental approach to corpus design and construction: application to a large contemporary Saudi corpus
Due to the rapid developments in technology and the sudden expansion of social media
use, Dialect Arabic has become an important source of data that needs to be addressed …
Is Arabic punctuation rule-governed?
This paper investigates the extent to which Arabic punctuation is rule-governed, with the aim
of improving text comprehension, disambiguation, and machine translation. The study …
Maknuune: A Large Open Palestinian Arabic Lexicon
We present Maknuune, a large open lexicon for the Palestinian Arabic dialect. Maknuune
has over 36K entries from 17K lemmas and 3.7K roots. All entries include diacritized Arabic …
Morphologically-analyzed and syntactically-annotated Quran dataset
This paper introduces the Morphologically-Analyzed and Syntactically-Annotated Quran
(MASAQ) dataset, a comprehensive resource designed to address the scarcity of annotated …
Towards Gulf Emirati Dialect Corpus from Social Media
Purpose: This paper discusses the need for a corpus of Emirati traditional phrases and
idioms in natural language processing (NLP) for the Gulf Emirati dialect and its potential …
Aggregating Hierarchical Dialectal Data for Arabic Dialect Classification
Arabic is a collection of dialectal variants that are historically related but significantly
different. These differences can be seen across regions, countries, and even cities in the …