Fine-tuning pre-trained models for Automatic Speech Recognition, experiments on a fieldwork corpus of Japhug (Trans-Himalayan family)
S Guillaume, G Wisniewski, C Macaire… - Proceedings of the …, 2022 - aclanthology.org
This is a report on results obtained in the development of speech recognition tools intended
to support linguistic documentation efforts. The test case is an extensive fieldwork corpus of …
Dialect text normalization to normative standard Finnish
N Partanen, M Hämäläinen… - Workshop on Noisy …, 2019 - researchportal.helsinki.fi
We compare different LSTMs and transformer models in terms of their effectiveness in
normalizing dialectal Finnish into the normative standard Finnish. As dialect is the common …
Dependency parsing of code-switching data with cross-lingual feature representations
This paper describes the test of a dependency parsing method which is based on
bidirectional LSTM feature representations and multilingual word embedding, and evaluates …
Multilingual dependency parsing for low-resource languages: Case studies on North Saami and Komi-Zyrian
KT Lim, N Partanen, T Poibeau - Proceedings of the Eleventh …, 2018 - aclanthology.org
The paper presents a method for parsing low-resource languages with very small training
corpora using multilingual word embeddings and annotated corpora of larger languages …
Using computational approaches to integrate endangered language legacy data into documentation corpora: Past experiences and challenges ahead
The systematic integration of pre-digital published transcriptions of legacy language
materials offers many possibilities to enrich documentary corpora with data that is often very …
Instant annotations in ELAN corpora of spoken and written Komi, an endangered language of the Barents Sea region
The paper describes work-in-progress by the Izhva Komi language documentation project,
which records new spoken language data, digitizes available recordings and annotates these …
The relevance of the source language in transfer learning for ASR
This study presents new experiments on Zyrian Komi speech recognition. We use DeepSpeech
to train ASR models from a language documentation corpus that contains both …
SpoCo–a simple and adaptable web interface for dialect corpora
R von Waldenfels, M Woźniak - Journal for language technology and …, 2016 - jlcl.org
We present SpoCo, a simple, yet effective system for the web-based query of dialect corpora
encoded in ELAN that provides users with advanced concordancing functions, as well as the …
Instant annotations–Applying NLP methods to the annotation of spoken language documentation corpora
The paper describes work-in-progress by the Pite Saami, Kola Saami and Izhva Komi
language documentation projects, all of which use similar data and technical frameworks …
Documenting endangered oral histories of the Arctic: A proposed symbiosis for documentary linguistics and oral history research, illustrated by Saami and Komi …
In this chapter, we argue that documentary linguistics, particularly as we practice it in our
own projects, can provide valuable resources for social science research. Especially in our …