Automatic normalisation of early Modern French

O Kuparinen, A Miletić, Y Scherrer - Findings of the Association for …, 2023 - aclanthology.org

Text normalization methods have been commonly applied to historical language or
user-generated content, but less often to dialectal transcriptions. In this paper, we introduce …

Cited by 7 · Related articles · All 10 versions

[PDF] aclanthology.org

Murreviikko-a dialectologically annotated and normalized dataset of Finnish tweets

O Kuparinen - Tenth Workshop on NLP for Similar Languages …, 2023 - aclanthology.org

This paper presents Murreviikko, a dataset of dialectal Finnish tweets which have been
dialectologically annotated and manually normalized to a standard form. The dataset can be …

Cited by 4 · Related articles · All 6 versions

[PDF] aclanthology.org

Dialect representation learning with neural dialect-to-standard normalization

O Kuparinen, Y Scherrer - Tenth Workshop on NLP for Similar …, 2023 - aclanthology.org

Language label tokens are often used in multilingual neural language modeling
and sequence-to-sequence learning to enhance the performance of such models. An …

Cited by 4 · Related articles · All 6 versions

[PDF] aclanthology.org

Automatic Normalisation of Middle French and its Impact on Productivity

R Rubino, S Coram-Mekkey, J Gerlach… - Proceedings of the …, 2024 - aclanthology.org

This paper presents a study on automatic normalisation of 16th century documents written in
Middle French. These documents present a large variety of wordforms which require …

Cited by 1 · Related articles · All 4 versions

[PDF] ipipan.waw.pl

Evaluating the Use of Generative LLMs for Intralingual Diachronic Translation of Middle-Polish Texts into Contemporary Polish

C Klamra, K Kryńska, M Ogrodniczuk - International Conference on Asian …, 2023 - Springer

This paper presents efforts towards creating a tool for translating texts from Middle Polish
into modern Polish. Archaic texts sourced from the CBDU digital library were translated into …

Cited by 1 · Related articles · All 3 versions

[PDF] aclanthology.org

CorCoDial-Machine translation techniques for corpus-based computational dialectology

Y Scherrer, O Kuparinen, A Miletić - Proceedings of the 24th …, 2023 - aclanthology.org

This paper presents CorCoDial, a research project funded by the Academy of Finland
aiming to leverage machine translation technology for corpus-based computational …

Cited by 1 · Related articles · All 5 versions

[PDF] arxiv.org

Modeling Orthographic Variation in Occitan's Dialects

ZW Hopton, N Aepli - arXiv preprint arXiv:2404.19315, 2024 - arxiv.org

Effectively normalizing textual data poses a considerable challenge, especially for
low-resource languages lacking standardized writing systems. In this study, we fine-tuned a …

Normalizing without Modernizing: Keeping Historical Wordforms of Middle French while Reducing Spelling Variants

R Rubino, J Gerlach, J Mutal… - Findings of the …, 2024 - aclanthology.org

Conservation of historical documents benefits from computational methods by alleviating the
manual labor related to digitization and modernization of textual content. Languages usually …

Cited by 1 · Related articles

[PDF] hal.science

Le projet FREEM: ressources, outils et enjeux pour l'étude du français d'Ancien Régime

S Gabay, PO Suarez, R Bawden, A Bartz… - TALN 2022-Traitement …, 2022 - hal.science

Despite their undeniable quality, the resources and tools available for analysing
Ancien Régime French are no longer able to meet the challenges of research in …

Cited by 1 · Related articles · All 10 versions

[PDF] ed.ac.uk

A transformer-based standardisation system for Scottish Gaelic

J Huang, B Alex, M Bauer, DS Jasin… - … of SIGUL 2023: 2nd …, 2023 - research.ed.ac.uk

The transition from rule-based to neural-based architectures has made it more difficult for
low-resource languages like Scottish Gaelic to participate in modern language technologies …