[PDF][PDF] Performance evaluation of deep neural networks applied to speech recognition: RNN, LSTM and GRU
A Shewalkar, D Nyavanandi, SA Ludwig - Journal of Artificial …, 2019 - sciendo.com
Abstract Deep Neural Networks (DNN) are nothing but neural networks with many hidden
layers. DNNs are becoming popular in automatic speech recognition tasks which combines …
layers. DNNs are becoming popular in automatic speech recognition tasks which combines …
[PDF][PDF] Semi-orthogonal low-rank matrix factorization for deep neural networks.
Abstract Time Delay Neural Networks (TDNNs), also known as onedimensional
Convolutional Neural Networks (1-d CNNs), are an efficient and well-performing neural …
Convolutional Neural Networks (1-d CNNs), are an efficient and well-performing neural …
[PDF][PDF] Purely sequence-trained neural networks for ASR based on lattice-free MMI.
In this paper we describe a method to perform sequencediscriminative training of neural
network acoustic models without the need for frame-level cross-entropy pre-training. We use …
network acoustic models without the need for frame-level cross-entropy pre-training. We use …
[PDF][PDF] A time delay neural network architecture for efficient modeling of long temporal contexts.
Recurrent neural network architectures have been shown to efficiently model long term
temporal dependencies between acoustic events. However the training time of recurrent …
temporal dependencies between acoustic events. However the training time of recurrent …
A pruned rnnlm lattice-rescoring algorithm for automatic speech recognition
Lattice-rescoring is a common approach to take advantage of recurrent neural language
models in ASR, where a word-lattice is generated from 1st-pass decoding and the lattice is …
models in ASR, where a word-lattice is generated from 1st-pass decoding and the lattice is …
Jhu aspire system: Robust lvcsr with tdnns, ivector adaptation and rnn-lms
Multi-style training, using data which emulates a variety of possible test scenarios, is a
popular approach towards robust acoustic modeling. However acoustic models capable of …
popular approach towards robust acoustic modeling. However acoustic models capable of …
Building state-of-the-art distant speech recognition using the CHiME-4 challenge with a setup of speech enhancement baseline
This paper describes a new baseline system for automatic speech recognition (ASR) in the
CHiME-4 challenge to promote the development of noisy ASR in speech processing …
CHiME-4 challenge to promote the development of noisy ASR in speech processing …
Neural network language modeling with letter-based features and importance sampling
In this paper we describe an extension of the Kaldi software toolkit to support neural-based
language modeling, intended for use in automatic speech recognition (ASR) and related …
language modeling, intended for use in automatic speech recognition (ASR) and related …
Wake word detection with streaming transformers
Modern wake word detection systems usually rely on neural networks for acoustic modeling.
Transformers has recently shown superior performance over LSTM and convolutional …
Transformers has recently shown superior performance over LSTM and convolutional …
[PDF][PDF] Recurrent neural network language model adaptation for conversational speech recognition.
We propose two adaptation models for recurrent neural network language models
(RNNLMs) to capture topic effects and longdistance triggers for conversational automatic …
(RNNLMs) to capture topic effects and longdistance triggers for conversational automatic …