Asynchronous stochastic optimization for sequence training of deep neural networks

M Hassanin, S Anwar, I Radwan, FS Khan, A Mian - Information Fusion, 2024 - Elsevier

Inspired by the human cognitive system, attention is a mechanism that imitates the human
cognitive awareness about specific information, amplifying critical details to focus more on …

Cited by 144 · Related articles · All 7 versions

[PDF] fardapaper.ir

A survey on deep learning for big data

Q Zhang, LT Yang, Z Chen, P Li - Information Fusion, 2018 - Elsevier

Deep learning, as one of the most currently remarkable machine learning techniques, has
achieved great success in many applications such as image analysis, speech recognition …

Cited by 1346 · Related articles · All 6 versions

[PDF] google.com

Convolutional, long short-term memory, fully connected deep neural networks

TN Sainath, O Vinyals, A Senior… - 2015 IEEE international …, 2015 - ieeexplore.ieee.org

Both Convolutional Neural Networks (CNNs) and Long Short-Term Memory (LSTM) have
shown improvements over Deep Neural Networks (DNNs) across a wide variety of speech …

Cited by 1967 · Related articles · All 10 versions

[PDF] isca-archive.org

[PDF] Learning the speech front-end with raw waveform CLDNNs.

TN Sainath, RJ Weiss, AW Senior, KW Wilson… - Interspeech, 2015 - isca-archive.org

Learning an acoustic model directly from the raw waveform has been an active area of
research. However, waveform-based models have not yet matched the performance of …

Cited by 626 · Related articles · All 10 versions

[PDF] arxiv.org

Highway long short-term memory RNNs for distant speech recognition

Y Zhang, G Chen, D Yu, K Yao… - … on acoustics, speech …, 2016 - ieeexplore.ieee.org

In this paper, we extend the deep long short-term memory (DLSTM) recurrent neural
networks by introducing gated direct connections between memory cells in adjacent layers …

Cited by 369 · Related articles · All 12 versions

Multichannel signal processing with deep neural networks for automatic speech recognition

TN Sainath, RJ Weiss, KW Wilson, B Li… - … on Audio, Speech …, 2017 - ieeexplore.ieee.org

Multichannel automatic speech recognition (ASR) systems commonly separate speech
enhancement, including localization, beamforming, and postfiltering, from acoustic …

Cited by 282 · Related articles · All 5 versions

[PDF] arxiv.org

Sparse overcomplete word vector representations

M Faruqui, Y Tsvetkov, D Yogatama, C Dyer… - arXiv preprint arXiv …, 2015 - arxiv.org

Current distributed representations of words show little resemblance to theories of lexical
semantics. The former are dense and uninterpretable, the latter largely based on familiar …

Cited by 238 · Related articles · All 7 versions

[PDF] microsoft.com

Scalable training of deep learning machines by incremental block training with intra-block parallel optimization and blockwise model-update filtering

K Chen, Q Huo - … conference on acoustics, speech and signal …, 2016 - ieeexplore.ieee.org

We present a new approach to scalable training of deep learning machines by incremental
block training with intra-block parallel optimization to leverage data parallelism and …

Cited by 167 · Related articles · All 4 versions

[PDF] isca-archive.org

[PDF] Lower Frame Rate Neural Network Acoustic Models.

G Pundak, TN Sainath - Interspeech, 2016 - isca-archive.org

Recently neural network acoustic models trained with Connectionist Temporal Classification
(CTC) were proposed as an alternative approach to conventional cross-entropy trained …

Cited by 158 · Related articles · All 8 versions

[PDF] isca-archive.org

[PDF] Neural network adaptive beamforming for robust multichannel speech recognition.

B Li, TN Sainath, RJ Weiss, KW Wilson, M Bacchiani - Interspeech, 2016 - isca-archive.org

Joint multichannel enhancement and acoustic modeling using neural networks has shown
promise over the past few years. However, one shortcoming of previous work [1, 2, 3] is that …

Cited by 152 · Related articles · All 12 versions