Integrating automatic transcription into the language documentation workflow: Experiments with Na data and the Persephone toolkit
Automatic speech recognition tools have potential for facilitating language documentation,
but in practice these tools remain little-used by linguists for a variety of reasons, such as that …
Automatic speech recognition for supporting endangered language documentation
E Prud'hommeaux, R Jimerson, R Hatcher… - 2021 - scholarspace.manoa.hawaii.edu
Generating accurate word-level transcripts of recorded speech for language documentation
is difficult and time-consuming, even for skilled speakers of the target language. Automatic …
User-friendly automatic transcription of low-resource languages: Plugging ESPnet into Elpis
O Adams, B Galliot, G Wisniewski… - arXiv preprint arXiv …, 2020 - arxiv.org
This paper reports on progress integrating the speech recognition toolkit ESPnet into Elpis, a
web front-end originally designed to provide access to the Kaldi automatic speech …
Utilizing language technology in the documentation of endangered Uralic languages
The paper describes work-in-progress by the Pite Saami, Kola Saami and Izhva Komi
language documentation projects, all of which record new spoken language data, digitize …
Multilingual dependency parsing for low-resource languages: Case studies on North Saami and Komi-Zyrian
KT Lim, N Partanen, T Poibeau - Proceedings of the Eleventh …, 2018 - aclanthology.org
The paper presents a method for parsing low-resource languages with very small training
corpora using multilingual word embeddings and annotated corpora of larger languages …
Using computational approaches to integrate endangered language legacy data into documentation corpora: Past experiences and challenges ahead
The systematic integration of pre-digital published transcriptions of legacy language
materials offers many possibilities to enrich documentary corpora with data that is often very …
Is There Any Hope for Developing Automated Translation Technology for Sign Languages?
This article discusses the prerequisites for the machine translation of sign languages. The
topic is complex, including questions relating to technology, interaction design, linguistics …
Instant annotations in ELAN corpora of spoken and written Komi, an endangered language of the Barents Sea region
The paper describes work-in-progress by the Izhva Komi language documentation project,
which records new spoken language data, digitizes available recordings and annotates these …
On editing dictionaries for Uralic languages in an online environment
We present an open online infrastructure for editing and visualization of dictionaries of
different Uralic languages (e.g. Erzya, Moksha, Skolt Sami and Komi-Zyrian). Our …
Building minority dependency treebanks, dictionaries and computational grammars at the same time—an experiment in Karelian treebanking
TA Pirinen - Proceedings of the Third Workshop on Universal …, 2019 - aclanthology.org
Building a treebank from scratch can easily be an elaborate, highly time-consuming task,
especially when working with a minority language with moderately complex morphology and …