Towards zero-shot learning for automatic phonemic transcription

Adaptation algorithms for neural network-based speech recognition: An overview

P Bell, J Fainberg, O Klejch, J Li… - IEEE Open Journal …, 2020 - ieeexplore.ieee.org

We present a structured overview of adaptation algorithms for neural network-based speech
recognition, considering both hybrid hidden Markov model/neural network systems and end …

Cited by 98 · Related articles · All 7 versions

Simple and effective zero-shot cross-lingual phoneme recognition

Q Xu, A Baevski, M Auli - arXiv preprint arXiv:2109.11680, 2021 - arxiv.org

Recent progress in self-training, self-supervised pretraining and unsupervised learning
enabled well performing speech recognition systems without any labeled data. However, in …

Cited by 87 · Related articles · All 6 versions

Universal phone recognition with a multilingual allophone system

X Li, S Dalmia, J Li, M Lee, P Littell… - ICASSP 2020-2020 …, 2020 - ieeexplore.ieee.org

Multilingual models can improve language processing, particularly for low resource
situations, by sharing parameters across languages. Multilingual acoustic models, however …

Cited by 143 · Related articles · All 10 versions

Master-ASR: achieving multilingual scalability and low-resource adaptation in ASR with modular learning

Z Yu, Y Zhang, K Qian, C Wan, Y Fu… - International …, 2023 - proceedings.mlr.press

Despite the impressive performance recently achieved by automatic speech recognition
(ASR), we observe two primary challenges that hinder its broader applications: (1) The …

Cited by 10 · Related articles · All 7 versions

UWSpeech: Speech to speech translation for unwritten languages

C Zhang, X Tan, Y Ren, T Qin, K Zhang… - Proceedings of the AAAI …, 2021 - ojs.aaai.org

Existing speech to speech translation systems heavily rely on the text of target language:
they usually translate source language either to target text and then synthesize target …

Cited by 57 · Related articles · All 4 versions

Zero-shot learning for grapheme to phoneme conversion with language ensemble

X Li, F Metze, DR Mortensen… - Findings of the …, 2022 - aclanthology.org

Grapheme-to-Phoneme (G2P) has many applications in NLP and speech fields.
Most existing work focuses heavily on languages with abundant training datasets, which …

Cited by 22 · Related articles · All 6 versions

An artificially intelligent approach for automatic speech processing based on triune ontology and adaptive tribonacci deep neural networks

G Deepak, D Surya, I Trivedi, A Kumar… - Computers & Electrical …, 2022 - Elsevier

Automatic Speech Recognition systems have become essential for an independent
automation during the present-day era. A hybrid approach for Automatic Speech …

Cited by 15 · Related articles · All 2 versions

Hierarchical Phone Recognition with Compositional Phonetics

X Li, J Li, F Metze, AW Black - Interspeech, 2021 - isca-archive.org

There is growing interest in building phone recognition systems for low-resource languages
as the majority of languages do not have any writing systems. Phone recognition systems …

Cited by 15 · Related articles · All 5 versions

Test-time adaptation toward personalized speech enhancement: Zero-shot learning with knowledge distillation

S Kim, M Kim - 2021 IEEE Workshop on Applications of Signal …, 2021 - ieeexplore.ieee.org

In realistic speech enhancement settings for end-user devices, we often encounter only a
few speakers and noise types that tend to reoccur in the specific acoustic environment. We …

Cited by 18 · Related articles · All 7 versions

Differentiable allophone graphs for language-universal speech recognition

B Yan, S Dalmia, DR Mortensen, F Metze… - arXiv preprint arXiv …, 2021 - arxiv.org

Building language-universal speech recognition systems entails producing phonological
units of spoken sound that can be shared across languages. While speech annotations at …

Cited by 12 · Related articles · All 8 versions