Systematic literature review of dialectal Arabic: identification and detection

A Elnagar, SM Yagi, AB Nassif, I Shahin… - IEEE …, 2021 - ieeexplore.ieee.org
It is becoming increasingly difficult to know who is working on what and how in
computational studies of Dialectal Arabic. This study comes to chart the field by conducting a …

Automatic language identification in texts: A survey

T Jauhiainen, M Lui, M Zampieri, T Baldwin… - Journal of Artificial …, 2019 - jair.org
Language identification ("LI") is the problem of determining the natural language that a
document or part thereof is written in. Automatic LI has been extensively researched for over …

Discriminating between similar languages and Arabic dialect identification: A report on the third DSL shared task

S Malmasi, M Zampieri, N Ljubešić… - Proceedings of the …, 2016 - aclanthology.org
We present the results of the third edition of the Discriminating between Similar Languages
(DSL) shared task, which was organized as part of the VarDial'2016 workshop at …

Language Identification and Morphosyntactic Tagging. The Second VarDial Evaluation Campaign.

M Zampieri, S Malmasi, P Nakov, A Ali, S Shon, J Glass… - 2018 - repository.ubn.ru.nl
We present the results and the findings of the Second VarDial Evaluation Campaign on
Natural Language Processing (NLP) for Similar Languages, Varieties and Dialects. The …

Automated essay scoring with string kernels and word embeddings

M Cozma, AM Butnaru, RT Ionescu - arXiv preprint arXiv:1804.07954, 2018 - arxiv.org
In this work, we present an approach based on combining string kernels and word
embeddings for automatic essay scoring. String kernels capture the similarity among strings …

QADI: Arabic dialect identification in the wild

A Abdelali, H Mubarak, Y Samih… - Proceedings of the …, 2021 - aclanthology.org
Proper dialect identification is important for a variety of Arabic NLP applications. In this
paper, we present a method for rapidly constructing a tweet dataset containing a wide range …

Language variety identification with true labels

M Zampieri, K North, T Jauhiainen, M Felice… - arXiv preprint arXiv …, 2023 - arxiv.org
Language identification is an important first step in many IR and NLP applications. Most
publicly available language identification datasets, however, are compiled under the …

Arabic dialect identification in the wild

A Abdelali, H Mubarak, Y Samih, S Hassan… - arXiv preprint arXiv …, 2020 - arxiv.org
We present QADI, an automatically collected dataset of tweets belonging to a wide range of
country-level Arabic dialects, covering 18 different countries in the Middle East and North …

Learning to identify Arabic and German dialects using multiple kernels

RT Ionescu, A Butnaru - Proceedings of the fourth workshop on …, 2017 - aclanthology.org
We present a machine learning approach for the Arabic Dialect Identification (ADI) and the
German Dialect Identification (GDI) Closed Shared Tasks of the DSL 2017 Challenge. The …

Modeling global syntactic variation in English using dialect classification

J Dunn - arXiv preprint arXiv:1904.05527, 2019 - arxiv.org
This paper evaluates global-scale dialect identification for 14 national varieties of English as
a means for studying syntactic variation. The paper makes three main contributions: (i) …