Overview of the 5th author profiling task at PAN 2017: Gender and language variety identification in Twitter

F Rangel, P Rosso, M Potthast… - Working notes papers of …, 2017 - downloads.webis.de
This overview presents the framework and the results of the Author Profiling task at PAN
2017. The objective of this year is to address gender and language variety identification. For …

Automatic language identification in texts: A survey

T Jauhiainen, M Lui, M Zampieri, T Baldwin… - Journal of Artificial …, 2019 - jair.org
Language identification ("LI") is the problem of determining the natural language that a
document or part thereof is written in. Automatic LI has been extensively researched for over …

A low dimensionality representation for language variety identification

F Rangel, M Franco-Salvador, P Rosso - International Conference on …, 2016 - Springer
Language variety identification aims at labelling texts in a native language (e.g.
Spanish, Portuguese, English) with its specific variation (e.g. Argentina, Chile, Mexico, Peru …

A systematic study of knowledge graph analysis for cross-language plagiarism detection

M Franco-Salvador, P Rosso… - Information Processing & …, 2016 - Elsevier
Cross-language plagiarism detection aims to detect plagiarised fragments of text among
documents in different languages. In this paper, we perform a systematic examination of …

PAN 2017: Author Profiling - Gender and Language Variety Prediction

M Martinc, I Skrjanec, K Zupan, S Pollak - CLEF (working notes), 2017 - academia.edu
We present the results of gender and language variety identification performed on the tweet
corpus prepared for the PAN 2017 Author profiling shared task. Our approach consists of …

Best practices of convolutional neural networks for question classification

M Pota, M Esposito, G De Pietro, H Fujita - Applied Sciences, 2020 - mdpi.com
Question Classification (QC) is of primary importance in question answering systems, since
it enables extraction of the correct answer type. State-of-the-art solutions for short text …

CATS: A tool for customized alignment of text simplification corpora

S Štajner, M Franco-Salvador, P Rosso… - Proceedings of the …, 2018 - aclanthology.org
In text simplification (TS), parallel corpora consisting of original sentences and their
manually simplified counterparts are very scarce and small in size, which impedes building …

A survey on author profiling, deception, and irony detection for the Arabic language

P Rosso, F Rangel, IH Farías, L Cagnina… - Language and …, 2018 - Wiley Online Library
The possibility of knowing people's traits on the basis of what they write is a field of growing
interest named author profiling. To infer a user's gender, age, native language, language …

UH-PRHLT at SemEval-2016 Task 3: Combining lexical and semantic-based features for community question answering

M Franco-Salvador, S Kar, T Solorio… - arXiv preprint arXiv …, 2018 - arxiv.org
In this work we describe the system built for the three English subtasks of the SemEval 2016
Task 3 by the Department of Computer Science of the University of Houston (UH) and the …

A new aligned simple German corpus

V Toborek, M Busch, M Boßert, C Bauckhage… - arXiv preprint arXiv …, 2022 - arxiv.org
"Leichte Sprache", the German counterpart to Simple English, is a regulated language
aiming to facilitate complex written language that would otherwise stay inaccessible to …