Automatic language identification in texts: A survey
Language identification ("LI") is the problem of determining the natural language that a
document or part thereof is written in. Automatic LI has been extensively researched for over …
Discriminating between similar languages and Arabic dialect identification: A report on the third DSL shared task
We present the results of the third edition of the Discriminating between Similar Languages
(DSL) shared task, which was organized as part of the VarDial'2016 workshop at …
A systematic study of knowledge graph analysis for cross-language plagiarism detection
M Franco-Salvador, P Rosso… - Information Processing & …, 2016 - Elsevier
Cross-language plagiarism detection aims to detect plagiarised fragments of text among
documents in different languages. In this paper, we perform a systematic examination of …
Overview of the DSL shared task 2015
We present the results of the 2nd edition of the Discriminating between Similar Languages
(DSL) shared task, which was organized as part of the LT4VarDial'2015 workshop and …
Discriminating similar languages: Evaluations and explorations
We present an analysis of the performance of machine learning classifiers on discriminating
between similar languages and language varieties. We carried out a number of experiments …
Application of the distributed document representation in the authorship attribution task for small corpora
Distributed word representation in a vector space (word embeddings) is a novel technique
that allows to represent words in terms of the elements in the neighborhood. Distributed …
UH-PRHLT at SemEval-2016 Task 3: Combining lexical and semantic-based features for community question answering
M Franco-Salvador, S Kar, T Solorio… - arXiv preprint arXiv …, 2018 - arxiv.org
In this work we describe the system built for the three English subtasks of the SemEval 2016
Task 3 by the Department of Computer Science of the University of Houston (UH) and the …
When sparse traditional models outperform dense neural networks: the curious case of discriminating between similar languages
We present the results of our participation in the VarDial 4 shared task on discriminating
closely related languages. Our submission includes simple traditional models using linear …
A character-level convolutional neural network for distinguishing similar languages and dialects
Y Belinkov, J Glass - arXiv preprint arXiv:1609.07568, 2016 - arxiv.org
Discriminating between closely-related language varieties is considered a challenging and
important task. This paper describes our submission to the DSL 2016 shared-task, which …
Discriminating similar languages with linear SVMs and neural networks
Ç Çöltekin, T Rama - Proceedings of the Third Workshop on NLP …, 2016 - aclanthology.org
This paper describes the systems we experimented with for participating in the
discriminating between similar languages (DSL) shared task 2016. We submitted results of a …