Distributed representations of words and documents for discriminating similar languages

T Jauhiainen, M Lui, M Zampieri, T Baldwin… - Journal of Artificial …, 2019 - jair.org

Language identification ("LI") is the problem of determining the natural language that a
document or part thereof is written in. Automatic LI has been extensively researched for over …

Cited by 253 · Related articles · All 11 versions

[PDF] aclanthology.org

Discriminating between similar languages and Arabic dialect identification: A report on the third DSL shared task

S Malmasi, M Zampieri, N Ljubešić… - Proceedings of the …, 2016 - aclanthology.org

We present the results of the third edition of the Discriminating between Similar Languages
(DSL) shared task, which was organized as part of the VarDial'2016 workshop at …

Cited by 213 · Related articles · All 12 versions

[PDF] upv.es

A systematic study of knowledge graph analysis for cross-language plagiarism detection

M Franco-Salvador, P Rosso… - Information Processing & …, 2016 - Elsevier

Cross-language plagiarism detection aims to detect plagiarised fragments of text among
documents in different languages. In this paper, we perform a systematic examination of …

Cited by 124 · Related articles · All 5 versions

[PDF] aclanthology.org

[PDF] Overview of the DSL shared task 2015

M Zampieri, L Tan, N Ljubešić… - Proceedings of the …, 2015 - aclanthology.org

We present the results of the 2nd edition of the Discriminating between Similar Languages
(DSL) shared task, which was organized as part of the LT4VarDial'2015 workshop and …

Cited by 122 · Related articles · All 13 versions

[PDF] arxiv.org

Discriminating similar languages: Evaluations and explorations

C Goutte, S Léger, S Malmasi, M Zampieri - arXiv preprint arXiv …, 2016 - arxiv.org

We present an analysis of the performance of machine learning classifiers on discriminating
between similar languages and language varieties. We carried out a number of experiments …

Cited by 70 · Related articles · All 10 versions

[PDF] github.io

Application of the distributed document representation in the authorship attribution task for small corpora

JP Posadas-Durán, H Gómez-Adorno, G Sidorov… - Soft Computing, 2017 - Springer

Distributed word representation in a vector space (word embeddings) is a novel technique
that allows to represent words in terms of the elements in the neighborhood. Distributed …

Cited by 60 · Related articles · All 10 versions

[PDF] arxiv.org

UH-PRHLT at SemEval-2016 Task 3: Combining lexical and semantic-based features for community question answering

M Franco-Salvador, S Kar, T Solorio… - arXiv preprint arXiv …, 2018 - arxiv.org

In this work we describe the system built for the three English subtasks of the SemEval 2016
Task 3 by the Department of Computer Science of the University of Houston (UH) and the …

Cited by 63 · Related articles · All 6 versions

[PDF] aclanthology.org

When sparse traditional models outperform dense neural networks: the curious case of discriminating between similar languages

M Medvedeva, M Kroon, B Plank - … of the Fourth Workshop on NLP …, 2017 - aclanthology.org

We present the results of our participation in the VarDial 4 shared task on discriminating
closely related languages. Our submission includes simple traditional models using linear …

Cited by 55 · Related articles · All 8 versions

[PDF] arxiv.org

A character-level convolutional neural network for distinguishing similar languages and dialects

Y Belinkov, J Glass - arXiv preprint arXiv:1609.07568, 2016 - arxiv.org

Discriminating between closely-related language varieties is considered a challenging and
important task. This paper describes our submission to the DSL 2016 shared-task, which …

Cited by 46 · Related articles · All 11 versions

[PDF] aclanthology.org

Discriminating similar languages with linear SVMs and neural networks

Ç Çöltekin, T Rama - Proceedings of the Third Workshop on NLP …, 2016 - aclanthology.org

This paper describes the systems we experimented with for participating in the
discriminating between similar languages (DSL) shared task 2016. We submitted results of a …

Cited by 44 · Related articles · All 5 versions