Pronunciation and silence probability modeling for ASR.

Performance evaluation of deep neural networks applied to speech recognition: RNN, LSTM and GRU

A Shewalkar, D Nyavanandi, SA Ludwig - Journal of Artificial …, 2019 - sciendo.com

Deep Neural Networks (DNN) are nothing but neural networks with many hidden
layers. DNNs are becoming popular in automatic speech recognition tasks which combines …

Cited by 443 · Related articles · All 9 versions

Semi-orthogonal low-rank matrix factorization for deep neural networks.

D Povey, G Cheng, Y Wang, K Li, H Xu… - Interspeech, 2018 - academia.edu

Time Delay Neural Networks (TDNNs), also known as one-dimensional
Convolutional Neural Networks (1-d CNNs), are an efficient and well-performing neural …

Cited by 605 · Related articles · All 9 versions

Purely sequence-trained neural networks for ASR based on lattice-free MMI.

D Povey, V Peddinti, D Galvez, P Ghahremani… - Interspeech, 2016 - isca-archive.org

In this paper we describe a method to perform sequence-discriminative training of neural
network acoustic models without the need for frame-level cross-entropy pre-training. We use …

Cited by 989 · Related articles · All 14 versions

A time delay neural network architecture for efficient modeling of long temporal contexts.

V Peddinti, D Povey, S Khudanpur - Interspeech, 2015 - isca-archive.org

Recurrent neural network architectures have been shown to efficiently model long term
temporal dependencies between acoustic events. However the training time of recurrent …

Cited by 1321 · Related articles · All 13 versions

A pruned RNNLM lattice-rescoring algorithm for automatic speech recognition

H Xu, T Chen, D Gao, Y Wang, K Li… - … on acoustics, speech …, 2018 - ieeexplore.ieee.org

Lattice-rescoring is a common approach to take advantage of recurrent neural language
models in ASR, where a word-lattice is generated from 1st-pass decoding and the lattice is …

Cited by 136 · Related articles · All 11 versions

JHU ASpIRE system: Robust LVCSR with TDNNs, iVector adaptation and RNN-LMs

V Peddinti, G Chen, V Manohar, T Ko… - … IEEE Workshop on …, 2015 - ieeexplore.ieee.org

Multi-style training, using data which emulates a variety of possible test scenarios, is a
popular approach towards robust acoustic modeling. However acoustic models capable of …

Cited by 131 · Related articles · All 7 versions

Building state-of-the-art distant speech recognition using the CHiME-4 challenge with a setup of speech enhancement baseline

SJ Chen, AS Subramanian, H Xu… - arXiv preprint arXiv …, 2018 - arxiv.org

This paper describes a new baseline system for automatic speech recognition (ASR) in the
CHiME-4 challenge to promote the development of noisy ASR in speech processing …

Cited by 84 · Related articles · All 9 versions

Neural network language modeling with letter-based features and importance sampling

H Xu, K Li, Y Wang, J Wang, S Kang… - … on acoustics, speech …, 2018 - ieeexplore.ieee.org

In this paper we describe an extension of the Kaldi software toolkit to support neural-based
language modeling, intended for use in automatic speech recognition (ASR) and related …

Cited by 86 · Related articles · All 11 versions

Wake word detection with streaming transformers

Y Wang, H Lv, D Povey, L Xie… - ICASSP 2021-2021 …, 2021 - ieeexplore.ieee.org

Modern wake word detection systems usually rely on neural networks for acoustic modeling.
Transformers have recently shown superior performance over LSTM and convolutional …

Cited by 36 · Related articles · All 9 versions

Recurrent neural network language model adaptation for conversational speech recognition.

K Li, H Xu, Y Wang, D Povey, S Khudanpur - Interspeech, 2018 - danielpovey.com

We propose two adaptation models for recurrent neural network language models
(RNNLMs) to capture topic effects and long-distance triggers for conversational automatic …

Cited by 68 · Related articles · All 12 versions