Adaptation algorithms for neural network-based speech recognition: An overview

P Bell, J Fainberg, O Klejch, J Li… - IEEE Open Journal …, 2020 - ieeexplore.ieee.org
We present a structured overview of adaptation algorithms for neural network-based speech
recognition, considering both hybrid hidden Markov model/neural network systems and end …

Simple and effective zero-shot cross-lingual phoneme recognition

Q Xu, A Baevski, M Auli - arXiv preprint arXiv:2109.11680, 2021 - arxiv.org
Recent progress in self-training, self-supervised pretraining and unsupervised learning
enabled well performing speech recognition systems without any labeled data. However, in …

Universal phone recognition with a multilingual allophone system

X Li, S Dalmia, J Li, M Lee, P Littell… - ICASSP 2020-2020 …, 2020 - ieeexplore.ieee.org
Multilingual models can improve language processing, particularly for low resource
situations, by sharing parameters across languages. Multilingual acoustic models, however …

Master-ASR: achieving multilingual scalability and low-resource adaptation in ASR with modular learning

Z Yu, Y Zhang, K Qian, C Wan, Y Fu… - International …, 2023 - proceedings.mlr.press
Despite the impressive performance recently achieved by automatic speech recognition
(ASR), we observe two primary challenges that hinder its broader applications: (1) The …

UWSpeech: Speech to speech translation for unwritten languages

C Zhang, X Tan, Y Ren, T Qin, K Zhang… - Proceedings of the AAAI …, 2021 - ojs.aaai.org
Existing speech to speech translation systems heavily rely on the text of target language:
they usually translate source language either to target text and then synthesize target …

Zero-shot learning for grapheme to phoneme conversion with language ensemble

X Li, F Metze, DR Mortensen… - Findings of the …, 2022 - aclanthology.org
Grapheme-to-Phoneme (G2P) has many applications in NLP and speech fields.
Most existing work focuses heavily on languages with abundant training datasets, which …

An artificially intelligent approach for automatic speech processing based on triune ontology and adaptive tribonacci deep neural networks

G Deepak, D Surya, I Trivedi, A Kumar… - Computers & Electrical …, 2022 - Elsevier
Automatic Speech Recognition systems have become essential for an independent
automation during the present-day era. A hybrid approach for Automatic Speech …

Hierarchical Phone Recognition with Compositional Phonetics

X Li, J Li, F Metze, AW Black - Interspeech, 2021 - isca-archive.org
There is growing interest in building phone recognition systems for low-resource languages
as the majority of languages do not have any writing systems. Phone recognition systems …

Test-time adaptation toward personalized speech enhancement: Zero-shot learning with knowledge distillation

S Kim, M Kim - 2021 IEEE Workshop on Applications of Signal …, 2021 - ieeexplore.ieee.org
In realistic speech enhancement settings for end-user devices, we often encounter only a
few speakers and noise types that tend to reoccur in the specific acoustic environment. We …

Differentiable allophone graphs for language-universal speech recognition

B Yan, S Dalmia, DR Mortensen, F Metze… - arXiv preprint arXiv …, 2021 - arxiv.org
Building language-universal speech recognition systems entails producing phonological
units of spoken sound that can be shared across languages. While speech annotations at …