Arabic natural language processing: An overview

I Guellil, H Saâdane, F Azouaou, B Gueni… - Journal of King Saud …, 2021 - Elsevier
Arabic is recognised as the 4th most used language of the Internet. Arabic has three main
varieties: (1) Classical Arabic (CA), (2) Modern Standard Arabic (MSA), (3) Arabic Dialect (AD) …

The interplay of variant, size, and task type in Arabic pre-trained language models

G Inoue, B Alhafni, N Baimukan, H Bouamor… - arXiv preprint arXiv …, 2021 - arxiv.org
In this paper, we explore the effects of language variants, data sizes, and fine-tuning task
types in Arabic pre-trained language models. To do so, we build three pre-trained language …

Freely available Arabic corpora: A scoping review

A Ahmed, N Ali, M Alzubaidi, W Zaghouani… - Computer Methods and …, 2022 - Elsevier
Background: Corpora play a vital role when training machine learning (ML) models and
building systems that use natural language processing (NLP). It can be challenging for …

A panoramic survey of natural language processing in the Arab world

K Darwish, N Habash, M Abbas, H Al-Khalifa… - Communications of the …, 2021 - dl.acm.org
The term natural language refers to any system of symbolic communication (spoken,
signed, or written) that has evolved naturally in humans without intentional human planning …

Part-of-speech tagging for Arabic tweets using CRF and Bi-LSTM

W AlKhwiter, N Al-Twairesh - Computer Speech & Language, 2021 - Elsevier
Over the past few years, Twitter has experienced massive growth and the volume of its
online content has increased rapidly. This content has been a rich source for several studies …

Nabra: Syrian Arabic Dialects with Morphological Annotations

A Nayouf, T Hammouda, M Jarrar, F Zaraket… - arXiv preprint arXiv …, 2023 - arxiv.org
This paper presents Nabra, a corpus of Syrian Arabic dialects with morphological
annotations. A team of Syrian natives collected more than 6K sentences containing about …

Curras+ baladi: Towards a Levantine corpus

KE Haff, M Jarrar, T Hammouda, F Zaraket - arXiv preprint arXiv …, 2022 - arxiv.org
The processing of the Arabic language is a complex field of research. This is due to many
factors, including the complex and rich morphology of Arabic, its high degree of ambiguity …

Unified guidelines and resources for Arabic dialect orthography

N Habash, F Eryani, S Khalifa, O Rambow… - Proceedings of the …, 2018 - aclanthology.org
We present a unified set of guidelines and resources for conventional orthography of
dialectal Arabic. While Standard Arabic has well-defined orthographic standards, none of the …

AraCust: a Saudi Telecom Tweets corpus for sentiment analysis

L Almuqren, A Cristea - PeerJ Computer Science, 2021 - peerj.com
Comparing Arabic to other languages, Arabic lacks large corpora for Natural Language
Processing (Assiri, Emam & Al-Dossari, 2018; Gamal et al., 2019). A number of scholars …

UniMorph 4.0: universal morphology

K Batsuren, O Goldman, S Khalifa, N Habash… - arXiv preprint arXiv …, 2022 - arxiv.org
The Universal Morphology (UniMorph) project is a collaborative effort providing broad-
coverage instantiated normalized morphological inflection tables for hundreds of diverse …