On the features of translationese
V Volansky, N Ordan, S Wintner - Digital Scholarship in the …, 2015 - academic.oup.com
Much research in translation studies indicates that translated texts are ontologically different
from original non-translated ones. Translated texts, in any language, can be considered a …
Can characters reveal your native language? A language-independent approach to native language identification
A common approach in text mining tasks such as text categorization, authorship
identification or plagiarism detection is to rely on features like words, part-of-speech tags …
How human is machine translationese? Comparing human and machine translations of text and speech
Translationese is a phenomenon present in human translations, simultaneous interpreting,
and even machine translations. Some translationese features tend to appear in …
Unsupervised identification of translationese
E Rabinovich, S Wintner - Transactions of the Association for …, 2015 - direct.mit.edu
Translated texts are distinctively different from original ones, to the extent that supervised
text classification methods can distinguish between them with high accuracy. These …
Learning to identify Arabic and German dialects using multiple kernels
RT Ionescu, A Butnaru - Proceedings of the fourth workshop on …, 2017 - aclanthology.org
We present a machine learning approach for the Arabic Dialect Identification (ADI) and the
German Dialect Identification (GDI) Closed Shared Tasks of the DSL 2017 Challenge. The …
String kernels for native language identification: Insights from behind the curtains
The most common approach in text mining classification tasks is to rely on features like
words, part-of-speech tags, stems, or some other high-level linguistic features. Recently, an …
Automatic detection of machine translated text and translation quality estimation
We show that it is possible to automatically detect machine translated text at sentence level
from monolingual corpora, using text classification methods. We show further that the …
Comparing feature-engineering and feature-learning approaches for multilingual translationese classification
D Pylypenko, K Amponsah-Kaakyire… - arXiv preprint arXiv …, 2021 - arxiv.org
Traditional hand-crafted linguistically-informed features have often been used for
distinguishing between translated and original non-translated texts. By contrast, to date …
Kernel Methods and String Kernels for Authorship Analysis
This paper presents our approach to the PAN 2012 Traditional Authorship Attribution tasks
and the Sexual Predator Identification task. We approached these tasks with machine …
UnibucKernel: An approach for Arabic dialect identification based on multiple string kernels
RT Ionescu, M Popescu - Proceedings of the Third Workshop on …, 2016 - aclanthology.org
The most common approach in text mining classification tasks is to rely on features like
words, part-of-speech tags, stems, or some other high-level linguistic features. Unlike the …