Improving hybrid CTC/attention end-to-end speech recognition with pretrained acoustic and language models

K Deng, S Cao, Y Zhang, L Ma - 2021 IEEE Automatic Speech …, 2021 - ieeexplore.ieee.org
Recently, self-supervised pretraining has achieved impressive results in end-to-end (E2E)
automatic speech recognition (ASR). However, the dominant sequence-to-sequence (S2S) …
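The title refers to the standard hybrid CTC/attention training objective (Watanabe et al., 2017), which interpolates a CTC loss on the encoder with a cross-entropy loss on the attention decoder. Below is a minimal, hypothetical PyTorch sketch of that objective; the function name, tensor shapes, and the padding convention are illustrative assumptions, not this paper's code.

```python
import torch
import torch.nn.functional as F

def hybrid_ctc_attention_loss(
    ctc_log_probs,   # (T, B, V): encoder frames, log-softmaxed over vocab V
    att_logits,      # (B, U, V): attention-decoder logits per target step
    targets,         # (B, U): gold token ids, padded with pad_id
    input_lengths,   # (B,): valid encoder frames per utterance
    target_lengths,  # (B,): valid target tokens per utterance
    lam=0.3,         # interpolation weight for the CTC branch (assumed value)
    pad_id=-1,       # padding id ignored by both losses (assumed convention)
):
    """L = lam * L_ctc + (1 - lam) * L_att."""
    # CTC branch: reads only the first target_lengths[b] tokens of each row,
    # so the pad_id entries beyond that are never seen.
    l_ctc = F.ctc_loss(ctc_log_probs, targets, input_lengths, target_lengths,
                       blank=0, zero_infinity=True)
    # Attention branch: per-step cross-entropy, masking out padded positions.
    l_att = F.cross_entropy(att_logits.transpose(1, 2), targets,
                            ignore_index=pad_id)
    return lam * l_ctc + (1 - lam) * l_att
```

In this multi-task setup the CTC weight is typically kept small (e.g. 0.2 to 0.3), so the attention decoder dominates the loss while the CTC branch encourages monotonic input-output alignment.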

Segatron: Segment-aware transformer for language modeling and understanding

H Bai, P Shi, J Lin, Y Xie, L Tan, K Xiong… - Proceedings of the …, 2021 - ojs.aaai.org
Transformers are powerful for sequence modeling. Nearly all state-of-the-art language
models and pre-trained language models are based on the Transformer architecture …
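Segatron's central idea is a segment-aware position encoding: the single learned token-position embedding of a standard Transformer is replaced by the sum of paragraph-, sentence-, and token-level position embeddings. The following is a hedged sketch of that scheme; the class name, maximum index sizes, and index conventions are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn

class SegmentAwarePositionEmbedding(nn.Module):
    """Sum of paragraph-, sentence-, and token-position embeddings,
    in the spirit of Segatron's segment-aware encoding (sizes assumed)."""

    def __init__(self, d_model, max_para=64, max_sent=128, max_tok=512):
        super().__init__()
        self.para = nn.Embedding(max_para, d_model)  # paragraph index in document
        self.sent = nn.Embedding(max_sent, d_model)  # sentence index in paragraph
        self.tok = nn.Embedding(max_tok, d_model)    # token index in sentence

    def forward(self, para_idx, sent_idx, tok_idx):
        # Each index tensor has shape (batch, seq_len); the three embeddings
        # are summed and added to token embeddings, as with ordinary
        # learned positions.
        return self.para(para_idx) + self.sent(sent_idx) + self.tok(tok_idx)
```

In practice the three index tensors would come from a simple segmenter over the raw text (e.g. splitting on paragraph breaks and sentence boundaries) before tokenization.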