Injecting text in self-supervised speech pretraining

Z Chen, Y Zhang, A Rosenberg… - 2021 IEEE Automatic …, 2021 - ieeexplore.ieee.org
Self-supervised pretraining for Automated Speech Recognition (ASR) has shown varied
degrees of success. In this paper, we propose to jointly learn representations during …

Improving Speech Recognition Using GAN-Based Speech Synthesis and Contrastive Unspoken Text Selection.

Z Chen, A Rosenberg, Y Zhang, G Wang… - …, 2020 - interspeech2020.org
Text-to-Speech synthesis (TTS) based data augmentation is a relatively new mechanism for
utilizing text-only data to improve automatic speech recognition (ASR) training without …