Fine-grained morphosyntactic analysis and generation tools for more than one thousand languages

AD McCarthy, C Kirov, M Grella… - … of The 12th …, 2020 - research-collection.ethz.ch

The Universal Morphology (UniMorph) project is a collaborative effort providing broad-coverage
instantiated normalized morphological paradigms for hundreds of diverse world …

Cited by 96 · Related articles · All 11 versions

[PDF] aclanthology.org

The Johns Hopkins University Bible corpus: 1600+ tongues for typological exploration

AD McCarthy, R Wicks, D Lewis, A Mueller… - Proceedings of the …, 2020 - aclanthology.org

We present findings from the creation of a massively parallel corpus in over 1600
languages, the Johns Hopkins University Bible Corpus (JHUBC). The corpus consists of …

Cited by 80 · Related articles · All 3 versions

[PDF] arxiv.org

Pre-trained multilingual sequence-to-sequence models: A hope for low-resource language translation?

ESA Lee, S Thillainathan, S Nayak… - arXiv preprint arXiv …, 2022 - arxiv.org

What can pre-trained multilingual sequence-to-sequence models like mBART contribute to
translating low-resource languages? We conduct a thorough empirical experiment in 10 …

Cited by 31 · Related articles · All 6 versions

[PDF] arxiv.org

Morphological Processing of Low-Resource Languages: Where We Are and What's Next

A Wiemerslage, M Silfverberg, C Yang… - arXiv preprint arXiv …, 2022 - arxiv.org

Automatic morphological processing can aid downstream natural language processing
applications, especially for low-resource languages, and assist language documentation …

Cited by 12 · Related articles · All 5 versions

[PDF] arxiv.org

The SIGMORPHON 2020 shared task on unsupervised morphological paradigm completion

K Kann, A McCarthy, G Nicolai, M Hulden - arXiv preprint arXiv …, 2020 - arxiv.org

In this paper, we describe the findings of the SIGMORPHON 2020 shared task on
unsupervised morphological paradigm completion (SIGMORPHON 2020 Task 2), a novel …

Cited by 20 · Related articles · All 9 versions

[PDF] arxiv.org

Meeting the needs of low-resource languages: The value of automatic alignments via pretrained models

A Ebrahimi, AD McCarthy, A Oncevay… - arXiv preprint arXiv …, 2023 - arxiv.org

Large multilingual models have inspired a new class of word alignment methods, which
work well for the model's pretraining languages. However, the languages most in need of …

Cited by 5 · Related articles · All 3 versions

[PDF] aclanthology.org

The SIGMORPHON 2022 Shared Task on Cross-lingual and Low-Resource Grapheme-to-Phoneme Conversion

AD McCarthy, JL Lee, A DeLucia… - Proceedings of the …, 2023 - aclanthology.org

Grapheme-to-phoneme conversion is an important component in many speech
technologies, but until recently there were no multilingual benchmarks for this task. The third …

Cited by 2 · Related articles · All 3 versions

[PDF] aclanthology.org

Joint learning model for low-resource agglutinative language morphological tagging

G Abudouwaili, K Abiderexiti, N Yi… - Proceedings of the 20th …, 2023 - aclanthology.org

Due to the lack of data resources, rule-based or transfer learning is mainly used in the
morphological tagging of low-resource languages. However, these methods require expert …

Cited by 2 · Related articles · All 3 versions

[PDF] aclanthology.org

Codex to corpus: Exploring annotation and processing for an open and extensible machine-readable edition of the Florentine Codex

F Tyers, R Pugh, V Berthoud - Proceedings of the Workshop on …, 2023 - aclanthology.org

This paper describes an ongoing effort to create, from the original hand-written text, a
machine-readable, linguistically-annotated, and easily-searchable corpus of the Nahuatl …

Cited by 4 · Related articles · All 3 versions

[PDF] aclanthology.org

Developing finite-state language technology for Maya

R Pugh, F Tyers, Q Castañeda - Proceedings of the Workshop on …, 2023 - aclanthology.org

We describe a suite of finite-state language technologies for Maya, a Mayan language
spoken in Mexico. At the core is a computational model of Maya morphology and phonology …

Cited by 2 · Related articles · All 2 versions