A survey on non-autoregressive generation for neural machine translation and beyond

Y Xiao, L Wu, J Guo, J Li, M Zhang… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Non-autoregressive (NAR) generation, which was first proposed in neural machine translation
(NMT) to speed up inference, has attracted much attention in both machine learning and …

A comparative study on non-autoregressive modelings for speech-to-text generation

Y Higuchi, N Chen, Y Fujita, H Inaguma… - 2021 IEEE Automatic …, 2021 - ieeexplore.ieee.org
Non-autoregressive (NAR) models simultaneously generate multiple outputs in a sequence,
which significantly reduces inference time at the cost of an accuracy drop compared to …

Non-autoregressive ASR modeling using pre-trained language models for Chinese speech recognition

FH Yu, KY Chen, KH Lu - IEEE/ACM Transactions on Audio …, 2022 - ieeexplore.ieee.org
Transformer-based models have led to significant innovation in various classic and practical
subjects, including speech processing, natural language processing, and computer vision …

A CTC alignment-based non-autoregressive transformer for end-to-end automatic speech recognition

R Fan, W Chu, P Chang, A Alwan - IEEE/ACM Transactions on …, 2023 - ieeexplore.ieee.org
Recently, end-to-end models have been widely used in automatic speech recognition (ASR)
systems. Two of the most representative approaches are connectionist temporal …

SeACo-Paraformer: A non-autoregressive ASR system with flexible and effective hotword customization ability

X Shi, Y Yang, Z Li, Y Chen, Z Gao… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org
Hotword customization remains an open issue in the ASR field: it is of value to
enable users of ASR systems to customize names of entities, persons and other phrases to …

BA-MoE: Boundary-Aware Mixture-of-Experts Adapter for Code-Switching Speech Recognition

P Chen, F Yu, Y Liang, H Xue, X Wan… - 2023 IEEE Automatic …, 2023 - ieeexplore.ieee.org
Mixture-of-experts based models, which use language experts to extract language-specific
representations effectively, have been well applied in code-switching automatic speech …

Decoupling recognition and transcription in Mandarin ASR

J Yuan, X Cai, D Gao, R Zheng… - 2021 IEEE Automatic …, 2021 - ieeexplore.ieee.org
Much of the recent literature on automatic speech recognition (ASR) is taking an end-to-end
approach. Unlike English where the writing system is closely related to sound, Chinese …

Contextualized End-to-end Automatic Speech Recognition with Intermediate Biasing Loss

M Shakeel, Y Sudo, Y Peng, S Watanabe - arXiv preprint arXiv …, 2024 - arxiv.org
Contextualized end-to-end automatic speech recognition has been an active research area,
with recent efforts focusing on the implicit learning of contextual phrases based on the final …

LV-CTC: Non-autoregressive ASR with CTC and latent variable models

Y Fujita, S Watanabe, X Chang… - 2023 IEEE Automatic …, 2023 - ieeexplore.ieee.org
Non-autoregressive (NAR) models for automatic speech recognition (ASR) aim to achieve
high accuracy and fast inference by simplifying the autoregressive (AR) generation process …

Achieving timestamp prediction while recognizing with non-autoregressive end-to-end ASR model

X Shi, Y Chen, S Zhang, Z Yan - National Conference on Man-Machine …, 2022 - Springer
Conventional ASR systems use frame-level phoneme posteriors to conduct forced alignment
(FA) and provide timestamps, while end-to-end ASR systems especially AED based ones …