Injecting text in self-supervised speech pretraining
Self-supervised pretraining for Automated Speech Recognition (ASR) has shown varied
degrees of success. In this paper, we propose to jointly learn representations during …
Improving Speech Recognition Using GAN-Based Speech Synthesis and Contrastive Unspoken Text Selection.
Text-to-Speech synthesis (TTS) based data augmentation is a relatively new mechanism for
utilizing text-only data to improve automatic speech recognition (ASR) training without …