A survey on non-autoregressive generation for neural machine translation and beyond
Non-autoregressive (NAR) generation, which is first proposed in neural machine translation
(NMT) to speed up inference, has attracted much attention in both machine learning and …
Code-switching text generation and injection in mandarin-english asr
Code-switching speech refers to a means of expression by mixing two or more languages
within a single utterance. Automatic Speech Recognition (ASR) with End-to-End (E2E) …
Mandarin-english code-switching speech recognition with self-supervised speech representation models
Code-switching (CS) is common in daily conversations where more than one language is
used within a sentence. The difficulties of CS speech recognition lie in alternating languages …
Language-specific acoustic boundary learning for mandarin-english code-switching speech recognition
Code-switching speech recognition (CSSR) transcribes speech that switches between
multiple languages or dialects within a single sentence. The main challenge in this task is …
Cyclic Transfer Learning for Mandarin-English Code-Switching Speech Recognition
Transfer learning is a common method to improve the performance of the model on a target
task via pre-training the model on pretext tasks. Different from the methods using …
Minimum word error training for non-autoregressive transformer-based code-switching asr
Non-autoregressive end-to-end ASR framework might be potentially appropriate for code-
switching recognition task thanks to its inherent property that present output token being …
LAE-ST-MOE: Boosted Language-Aware Encoder Using Speech Translation Auxiliary Task for E2E Code-Switching ASR
G Ma, W Wang, Y Li, Y Yang, B Du… - 2023 IEEE Automatic …, 2023 - ieeexplore.ieee.org
Recently, to mitigate the confusion between different languages in code-switching (CS)
automatic speech recognition (ASR), the conditionally factorized models, such as the …
Improving Recognition of Out-of-vocabulary Words in E2E Code-switching ASR by Fusing Speech Generation Methods.
Out-of-vocabulary (OOV) is a common problem for end-to-end (E2E) ASR. For
code-switching (CS), the OOV problem on the embedded language is further aggravated and …
Context Conditioning via Surrounding Predictions for Non-recurrent CTC Models
B Naowarat, C Piansaddhayanon… - IEEE …, 2023 - ieeexplore.ieee.org
Connectionist Temporal Classification (CTC) loss has become widely used in sequence
modeling tasks such as Automatic Speech Recognition (ASR) and Handwritten Text …
Romanization Encoding For Multilingual ASR
W Ding, F Jia, H Xu, Y Xi, J Lai, B Ginsburg - arXiv preprint arXiv …, 2024 - arxiv.org
We introduce romanization encoding for script-heavy languages to optimize multilingual and
code-switching Automatic Speech Recognition (ASR) systems. By adopting romanization …