Systematic literature review of dialectal Arabic: identification and detection
It is becoming increasingly difficult to know who is working on what and how in
computational studies of Dialectal Arabic. This study comes to chart the field by conducting a …
Automatic language identification in texts: A survey
Language identification ("LI") is the problem of determining the natural language that a
document or part thereof is written in. Automatic LI has been extensively researched for over …
Discriminating between similar languages and Arabic dialect identification: A report on the third DSL shared task
We present the results of the third edition of the Discriminating between Similar Languages
(DSL) shared task, which was organized as part of the VarDial'2016 workshop at …
Language Identification and Morphosyntactic Tagging. The Second VarDial Evaluation Campaign.
We present the results and the findings of the Second VarDial Evaluation Campaign on
Natural Language Processing (NLP) for Similar Languages, Varieties and Dialects. The …
Automated essay scoring with string kernels and word embeddings
M Cozma, AM Butnaru, RT Ionescu - arXiv preprint arXiv:1804.07954, 2018 - arxiv.org
In this work, we present an approach based on combining string kernels and word
embeddings for automatic essay scoring. String kernels capture the similarity among strings …
QADI: Arabic dialect identification in the wild
Proper dialect identification is important for a variety of Arabic NLP applications. In this
paper, we present a method for rapidly constructing a tweet dataset containing a wide range …
Language variety identification with true labels
Language identification is an important first step in many IR and NLP applications. Most
publicly available language identification datasets, however, are compiled under the …
Arabic dialect identification in the wild
We present QADI, an automatically collected dataset of tweets belonging to a wide range of
country-level Arabic dialects, covering 18 different countries in the Middle East and North …
Learning to identify Arabic and German dialects using multiple kernels
RT Ionescu, A Butnaru - Proceedings of the fourth workshop on …, 2017 - aclanthology.org
We present a machine learning approach for the Arabic Dialect Identification (ADI) and the
German Dialect Identification (GDI) Closed Shared Tasks of the DSL 2017 Challenge. The …
Modeling global syntactic variation in English using dialect classification
J Dunn - arXiv preprint arXiv:1904.05527, 2019 - arxiv.org
This paper evaluates global-scale dialect identification for 14 national varieties of English as
a means for studying syntactic variation. The paper makes three main contributions: (i) …