Sequence discriminative distributed training of long short-term memory recurrent neural networks

Deep learning in neural networks: An overview

J Schmidhuber - Neural networks, 2015 - Elsevier

In recent years, deep artificial neural networks (including recurrent ones) have won
numerous contests in pattern recognition and machine learning. This historical survey …

Cited by 23773 - Related articles - All 42 versions

Speech emotion recognition: a comprehensive survey

MJ Al-Dujaili, A Ebrahimi-Moghadam - Wireless Personal Communications, 2023 - Springer

Speech emotion recognition could be considered a new topic in speech processing, where
it plays an essential role in human interaction. Emotions are a kind of speech …

Cited by 51 - Related articles - All 5 versions

[PDF] arxiv.org

Deep learning for audio signal processing

H Purwins, B Li, T Virtanen, J Schlüter… - IEEE Journal of …, 2019 - ieeexplore.ieee.org

Given the recent surge in developments of deep learning, this paper provides a review of the
state-of-the-art deep learning techniques for audio signal processing. Speech, music, and …

Cited by 909 - Related articles - All 7 versions

[HTML] mlr.press

[HTML] Deep speech 2: End-to-end speech recognition in English and Mandarin

D Amodei, S Ananthanarayanan… - International …, 2016 - proceedings.mlr.press

We show that an end-to-end deep learning approach can be used to recognize either
English or Mandarin Chinese speech, two vastly different languages. Because it replaces …

Cited by 3869 - Related articles - All 13 versions

[PDF] archive.org

Listen, attend and spell: A neural network for large vocabulary conversational speech recognition

W Chan, N Jaitly, Q Le, O Vinyals - 2016 IEEE international …, 2016 - ieeexplore.ieee.org

We present Listen, Attend and Spell (LAS), a neural speech recognizer that transcribes
speech utterances directly to characters without pronunciation models, HMMs or other …

Cited by 2844 - Related articles - All 7 versions

[PDF] thecvf.com

Long-term recurrent convolutional networks for visual recognition and description

J Donahue, L Anne Hendricks… - Proceedings of the …, 2015 - openaccess.thecvf.com

Models comprised of deep convolutional network layers have dominated recent
image interpretation tasks; we investigate whether models which are also compositional, or …

Cited by 8005 - Related articles - All 25 versions

[PDF] arxiv.org

Exploring architectures, data and units for streaming end-to-end speech recognition with RNN-transducer

K Rao, H Sak, R Prabhavalkar - 2017 IEEE automatic speech …, 2017 - ieeexplore.ieee.org

We investigate training end-to-end speech recognition models with the recurrent neural
network transducer (RNN-T): a streaming, all-neural, sequence-to-sequence architecture …

Cited by 421 - Related articles - All 6 versions

[PDF] arxiv.org

Sequence-to-sequence learning as beam-search optimization

S Wiseman, AM Rush - arXiv preprint arXiv:1606.02960, 2016 - arxiv.org

Sequence-to-Sequence (seq2seq) modeling has rapidly become an important
general-purpose NLP tool that has proven effective for many text-generation and sequence-labeling …

Cited by 676 - Related articles - All 8 versions

[PDF] academia.edu

[BOOK][B] Automatic speech recognition

D Yu, L Deng - 2016 - Springer

Automatic Speech Recognition (ASR), which aims to enable natural human–machine
interaction, has been an intensive research area for decades. Many core technologies, such …

Cited by 1602 - Related articles - All 9 versions

[PDF] github.io

Parallel recurrent neural network architectures for feature-rich session-based recommendations

B Hidasi, M Quadrana, A Karatzoglou… - Proceedings of the 10th …, 2016 - dl.acm.org

Real-life recommender systems often face the daunting task of providing recommendations
based only on the clicks of a user session. Methods that rely on user profiles, such as matrix …

Cited by 591 - Related articles - All 6 versions