Fleurs: Few-shot learning evaluation of universal representations of speech
We introduce FLEURS, the Few-shot Learning Evaluation of Universal Representations of
Speech benchmark. FLEURS is an n-way parallel speech dataset in 102 languages built on …
Speech benchmark. FLEURS is an n-way parallel speech dataset in 102 languages built on …
[HTML][HTML] Speech and language processing with deep learning for dementia diagnosis: A systematic review
M Shi, G Cheung, SR Shahamiri - Psychiatry Research, 2023 - Elsevier
Dementia is a progressive neurodegenerative disease that burdens the person living with
the disease, their families, and medical and social services. Timely diagnosis of dementia …
the disease, their families, and medical and social services. Timely diagnosis of dementia …
Real-time neural radiance talking portrait synthesis via audio-spatial decomposition
While dynamic Neural Radiance Fields (NeRF) have shown success in high-fidelity 3D
modeling of talking portraits, the slow training and inference speed severely obstruct their …
modeling of talking portraits, the slow training and inference speed severely obstruct their …
A comparison of discrete and soft speech units for improved voice conversion
The goal of voice conversion is to transform source speech into a target voice, keeping the
content unchanged. In this paper, we focus on self-supervised representation learning for …
content unchanged. In this paper, we focus on self-supervised representation learning for …
Simple and effective zero-shot cross-lingual phoneme recognition
Recent progress in self-training, self-supervised pretraining and unsupervised learning
enabled well performing speech recognition systems without any labeled data. However, in …
enabled well performing speech recognition systems without any labeled data. However, in …
Improving massively multilingual asr with auxiliary ctc objectives
Multilingual Automatic Speech Recognition (ASR) models have extended the usability of
speech technologies to a wide variety of languages. With how many languages these …
speech technologies to a wide variety of languages. With how many languages these …
Language ID in the wild: Unexpected challenges on the path to a thousand-language web text corpus
Large text corpora are increasingly important for a wide variety of Natural Language
Processing (NLP) tasks, and automatic language identification (LangID) is a core technology …
Processing (NLP) tasks, and automatic language identification (LangID) is a core technology …
Multilingual speech recognition for Turkic languages
The primary aim of this study was to contribute to the development of multilingual automatic
speech recognition for lower-resourced Turkic languages. Ten languages—Azerbaijani …
speech recognition for lower-resourced Turkic languages. Ten languages—Azerbaijani …
ASR2K: Speech recognition for around 2000 languages without audio
Most recent speech recognition models rely on large supervised datasets, which are
unavailable for many low-resource languages. In this work, we present a speech recognition …
unavailable for many low-resource languages. In this work, we present a speech recognition …
[PDF][PDF] Low Resource ASR: The Surprising Effectiveness of High Resource Transliteration.
Cross-lingual transfer of knowledge from high-resource languages to low-resource
languages is an important research problem in automatic speech recognition (ASR). We …
languages is an important research problem in automatic speech recognition (ASR). We …