A survey on non-autoregressive generation for neural machine translation and beyond

Y Xiao, L Wu, J Guo, J Li, M Zhang… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Non-autoregressive (NAR) generation, which is first proposed in neural machine translation
(NMT) to speed up inference, has attracted much attention in both machine learning and …

Code-switching text generation and injection in Mandarin-English ASR

H Yu, Y Hu, Y Qian, M Jin, L Liu, S Liu… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org
Code-switching speech refers to a means of expression by mixing two or more languages
within a single utterance. Automatic Speech Recognition (ASR) with End-to-End (E2E) …

Mandarin-English code-switching speech recognition with self-supervised speech representation models

LH Tseng, YK Fu, HJ Chang, H Lee - arXiv preprint arXiv:2110.03504, 2021 - arxiv.org
Code-switching (CS) is common in daily conversations where more than one language is
used within a sentence. The difficulties of CS speech recognition lie in alternating languages …

Language-specific acoustic boundary learning for Mandarin-English code-switching speech recognition

Z Fan, L Dong, C Shen, Z Liang, J Zhang, L Lu… - arXiv preprint arXiv …, 2023 - arxiv.org
Code-switching speech recognition (CSSR) transcribes speech that switches between
multiple languages or dialects within a single sentence. The main challenge in this task is …

Cyclic Transfer Learning for Mandarin-English Code-Switching Speech Recognition

CH Nga, DQ Vu, HH Luong, CL Huang… - IEEE Signal …, 2023 - ieeexplore.ieee.org
Transfer learning is a common method to improve the performance of the model on a target
task via pre-training the model on pretext tasks. Different from the methods using …

Minimum word error training for non-autoregressive transformer-based code-switching ASR

Y Peng, J Zhang, H Xu, H Huang… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org
Non-autoregressive end-to-end ASR framework might be potentially appropriate for code-
switching recognition task thanks to its inherent property that present output token being …

LAE-ST-MOE: Boosted Language-Aware Encoder Using Speech Translation Auxiliary Task for E2E Code-Switching ASR

G Ma, W Wang, Y Li, Y Yang, B Du… - 2023 IEEE Automatic …, 2023 - ieeexplore.ieee.org
Recently, to mitigate the confusion between different languages in code-switching (CS)
automatic speech recognition (ASR), the conditionally factorized models, such as the …

Improving Recognition of Out-of-vocabulary Words in E2E Code-switching ASR by Fusing Speech Generation Methods

L Ye, G Cheng, R Yang, Z Yang, S Tian, P Zhang… - …, 2022 - researchgate.net
Out-of-vocabulary (OOV) is a common problem for end-to-end (E2E) ASR. For
code-switching (CS), the OOV problem on the embedded language is further aggravated and …

Context Conditioning via Surrounding Predictions for Non-recurrent CTC Models

B Naowarat, C Piansaddhayanon… - IEEE …, 2023 - ieeexplore.ieee.org
Connectionist Temporal Classification (CTC) loss has become widely used in sequence
modeling tasks such as Automatic Speech Recognition (ASR) and Handwritten Text …

Romanization Encoding For Multilingual ASR

W Ding, F Jia, H Xu, Y Xi, J Lai, B Ginsburg - arXiv preprint arXiv …, 2024 - arxiv.org
We introduce romanization encoding for script-heavy languages to optimize multilingual and
code-switching Automatic Speech Recognition (ASR) systems. By adopting romanization …