Language varieties of Italy: Technology challenges and opportunities

A Ramponi - Transactions of the Association for Computational …, 2024 - direct.mit.edu
Italy is characterized by a one-of-a-kind linguistic diversity landscape in Europe, which
implicitly encodes local knowledge, cultural traditions, artistic expressions, and history of its …

Language variety identification with true labels

M Zampieri, K North, T Jauhiainen, M Felice… - arXiv preprint arXiv …, 2023 - arxiv.org
Language identification is an important first step in many IR and NLP applications. Most
publicly available language identification datasets, however, are compiled under the …

VarDial evaluation campaign 2024: Commonsense reasoning in dialects and multi-label similar language identification

AG Chifu, G Glavaš, RT Ionescu… - Workshop on NLP …, 2024 - researchportal.helsinki.fi
This report presents the results of the shared tasks organized as part of the VarDial
Evaluation Campaign 2024. The campaign is part of the eleventh workshop on Natural …

DiatopIt: A corpus of social media posts for the study of diatopic language variation in Italy

A Ramponi, C Casula - Tenth Workshop on NLP for Similar …, 2023 - aclanthology.org
We introduce DiatopIt, the first corpus specifically focused on diatopic language variation in
Italy for language varieties other than Standard Italian. DiatopIt comprises over 15K …

Italian language and dialect identification and regional French variety detection using adaptive naive Bayes

T Jauhiainen, H Jauhiainen… - International …, 2022 - researchportal.helsinki.fi
These proceedings include the 13 papers presented at the Ninth Workshop on NLP for
Similar Languages, Varieties and Dialects (VarDial), co-located with the 29th International …

DADA: Dialect adaptation via dynamic aggregation of linguistic rules

Y Liu, W Held, D Yang - Proceedings of the 2023 Conference on …, 2023 - aclanthology.org
Existing large language models (LLMs) that mainly focus on Standard American English
(SAE) often lead to significantly worse performance when being applied to other English …

What do dialect speakers want? A survey of attitudes towards language technology for German dialects

V Blaschke, C Purschke, H Schütze, B Plank - arXiv preprint arXiv …, 2024 - arxiv.org
Natural language processing (NLP) has largely focused on modelling standardized
languages. More recently, attention has increasingly shifted to local, non-standardized …

Fine-tuning BERT with character-level noise for zero-shot transfer to dialects and closely-related languages

A Srivastava, D Chiang - arXiv preprint arXiv:2303.17683, 2023 - arxiv.org
In this work, we induce character-level noise in various forms when fine-tuning BERT to
enable zero-shot cross-lingual transfer to unseen dialects and languages. We fine-tune …

Optimizing naive Bayes for Arabic dialect identification

T Jauhiainen, H Jauhiainen… - Arabic Natural …, 2022 - researchportal.helsinki.fi
This article describes the language identification system used by the SUKI team in the 2022
Nuanced Arabic Dialect Identification (NADI) shared task. In addition to the system …

Dialect and variant identification as a multi-label classification task: A proposal based on near-duplicate analysis

G Bernier-Colborne, C Goutte… - Tenth Workshop on NLP …, 2023 - aclanthology.org
We argue that dialect identification should be treated as a multi-label classification problem
rather than the single-class setting prevalent in existing collections and evaluations. In order …