Related articles - Academic Resource Search

[PDF] A Comparison of sequence-to-sequence models for speech recognition.

R Prabhavalkar, K Rao, TN Sainath, B Li, L Johnson… - Interspeech, 2017 - isca-archive.org

In this work, we conduct a detailed evaluation of various all-neural, end-to-end trained,
sequence-to-sequence models applied to the task of speech recognition. Notably, each of …

Cited by 399 · Related articles · All 7 versions

[PDF] isca-archive.org

[PDF] Recurrent neural aligner: An encoder-decoder neural network model for sequence to sequence mapping.

H Sak, M Shannon, K Rao, F Beaufays - Interspeech, 2017 - isca-archive.org

We introduce an encoder-decoder recurrent neural network model called Recurrent Neural
Aligner (RNA) that can be used for sequence to sequence mapping tasks. Like connectionist …

Cited by 153 · Related articles · All 4 versions

[PDF] arxiv.org

Exploring architectures, data and units for streaming end-to-end speech recognition with rnn-transducer

K Rao, H Sak, R Prabhavalkar - 2017 IEEE automatic speech …, 2017 - ieeexplore.ieee.org

We investigate training end-to-end speech recognition models with the recurrent neural
network transducer (RNN-T): a streaming, all-neural, sequence-to-sequence architecture …

Cited by 422 · Related articles · All 6 versions

Multi-accent speech recognition with hierarchical grapheme based models

K Rao, H Sak - … conference on acoustics, speech and signal …, 2017 - ieeexplore.ieee.org

We train grapheme-based acoustic models for speech recognition using a hierarchical
recurrent neural network architecture with connectionist temporal classification (CTC) loss …

Cited by 89 · Related articles · All 4 versions

[PDF] arxiv.org

Direct acoustics-to-word models for english conversational speech recognition

K Audhkhasi, B Ramabhadran, G Saon… - arXiv preprint arXiv …, 2017 - arxiv.org

Recent work on end-to-end automatic speech recognition (ASR) has shown that the
connectionist temporal classification (CTC) loss can be used to convert acoustics to phone …

Cited by 168 · Related articles · All 9 versions

[PDF] mlr.press

Towards end-to-end speech recognition with recurrent neural networks

A Graves, N Jaitly - International conference on machine …, 2014 - proceedings.mlr.press

This paper presents a speech recognition system that directly transcribes audio data with
text, without requiring an intermediate phonetic representation. The system is based on a …

Cited by 3070 · Related articles · All 10 versions

[PDF] arxiv.org

A comparison of modeling units in sequence-to-sequence speech recognition with the transformer on mandarin chinese

S Zhou, L Dong, S Xu, B Xu - International Conference on Neural …, 2018 - Springer

The choice of modeling units is critical to automatic speech recognition (ASR) tasks.
Conventional ASR systems typically choose context-dependent states (CD-states) or context …

Cited by 72 · Related articles · All 4 versions

[PDF] isca-archive.org

[PDF] Lower Frame Rate Neural Network Acoustic Models.

G Pundak, TN Sainath - Interspeech, 2016 - isca-archive.org

Recently neural network acoustic models trained with Connectionist Temporal Classification
(CTC) were proposed as an alternative approach to conventional cross-entropy trained …

Cited by 158 · Related articles · All 8 versions

[PDF] cmu.edu

An empirical exploration of CTC acoustic models

Y Miao, M Gowayyed, X Na, T Ko… - … on acoustics, speech …, 2016 - ieeexplore.ieee.org

The connectionist temporal classification (CTC) loss function has several interesting
properties relevant for automatic speech recognition (ASR): applied on top of deep recurrent …

Cited by 107 · Related articles · All 7 versions

[PDF] isca-archive.org

[PDF] Recurrent neural network and LSTM models for lexical utterance classification.

SV Ravuri, A Stolcke - Interspeech, 2015 - isca-archive.org

Utterance classification is a critical pre-processing step for many speech understanding and
dialog systems. In multi-user settings, one needs to first identify if an utterance is even …

Cited by 258 · Related articles · All 8 versions

