Improving hybrid CTC/attention end-to-end speech recognition with pretrained acoustic and language models
Recently, self-supervised pretraining has achieved impressive results in end-to-end (E2E)
automatic speech recognition (ASR). However, the dominant sequence-to-sequence (S2S) …
Segatron: Segment-aware transformer for language modeling and understanding
Transformers are powerful for sequence modeling. Nearly all state-of-the-art language
models and pre-trained language models are based on the Transformer architecture …