Visual attention methods in deep learning: An in-depth survey
Inspired by the human cognitive system, attention is a mechanism that imitates the human
cognitive awareness about specific information, amplifying critical details to focus more on …
A survey on deep learning for big data
Deep learning, as one of the most currently remarkable machine learning techniques, has
achieved great success in many applications such as image analysis, speech recognition …
Convolutional, long short-term memory, fully connected deep neural networks
Both Convolutional Neural Networks (CNNs) and Long Short-Term Memory (LSTM) have
shown improvements over Deep Neural Networks (DNNs) across a wide variety of speech …
Learning the speech front-end with raw waveform CLDNNs.
Learning an acoustic model directly from the raw waveform has been an active area of
research. However, waveform-based models have not yet matched the performance of …
Highway long short-term memory rnns for distant speech recognition
In this paper, we extend the deep long short-term memory (DLSTM) recurrent neural
networks by introducing gated direct connections between memory cells in adjacent layers …
Multichannel signal processing with deep neural networks for automatic speech recognition
Multichannel automatic speech recognition (ASR) systems commonly separate speech
enhancement, including localization, beamforming, and postfiltering, from acoustic …
Sparse overcomplete word vector representations
Current distributed representations of words show little resemblance to theories of lexical
semantics. The former are dense and uninterpretable, the latter largely based on familiar …
Scalable training of deep learning machines by incremental block training with intra-block parallel optimization and blockwise model-update filtering
K Chen, Q Huo - … conference on acoustics, speech and signal …, 2016 - ieeexplore.ieee.org
We present a new approach to scalable training of deep learning machines by incremental
block training with intra-block parallel optimization to leverage data parallelism and …
Lower Frame Rate Neural Network Acoustic Models.
G Pundak, TN Sainath - Interspeech, 2016 - isca-archive.org
Recently neural network acoustic models trained with Connectionist Temporal Classification
(CTC) were proposed as an alternative approach to conventional cross-entropy trained …
Neural network adaptive beamforming for robust multichannel speech recognition.
Joint multichannel enhancement and acoustic modeling using neural networks has shown
promise over the past few years. However, one shortcoming of previous work [1, 2, 3] is that …