A Comparison of sequence-to-sequence models for speech recognition.

R Prabhavalkar, K Rao, TN Sainath, B Li, L Johnson… - Interspeech, 2017 - isca-archive.org
Abstract
In this work, we conduct a detailed evaluation of various all-neural, end-to-end trained, sequence-to-sequence models applied to the task of speech recognition. Notably, each of these systems directly predicts graphemes in the written domain, without using an external pronunciation lexicon or a separate language model. We examine several sequence-to-sequence models including connectionist temporal classification (CTC), the recurrent neural network (RNN) transducer, an attention-based model, and a model which augments the RNN transducer with an attention mechanism.