Deep learning in neural networks: An overview

J Schmidhuber - Neural networks, 2015 - Elsevier
In recent years, deep artificial neural networks (including recurrent ones) have won
numerous contests in pattern recognition and machine learning. This historical survey …

Speech emotion recognition: a comprehensive survey

MJ Al-Dujaili, A Ebrahimi-Moghadam - Wireless Personal Communications, 2023 - Springer
Speech emotion recognition could be considered a new topic in speech processing where
it plays an essential role in human interaction. Emotions are a kind of speech …

Deep learning for audio signal processing

H Purwins, B Li, T Virtanen, J Schlüter… - IEEE Journal of …, 2019 - ieeexplore.ieee.org
Given the recent surge in developments of deep learning, this paper provides a review of the
state-of-the-art deep learning techniques for audio signal processing. Speech, music, and …

Deep speech 2: End-to-end speech recognition in English and Mandarin

D Amodei, S Ananthanarayanan… - International …, 2016 - proceedings.mlr.press
We show that an end-to-end deep learning approach can be used to recognize either
English or Mandarin Chinese speech, two vastly different languages. Because it replaces …

Listen, attend and spell: A neural network for large vocabulary conversational speech recognition

W Chan, N Jaitly, Q Le, O Vinyals - 2016 IEEE international …, 2016 - ieeexplore.ieee.org
We present Listen, Attend and Spell (LAS), a neural speech recognizer that transcribes
speech utterances directly to characters without pronunciation models, HMMs or other …

Long-term recurrent convolutional networks for visual recognition and description

J Donahue, L Anne Hendricks… - Proceedings of the …, 2015 - openaccess.thecvf.com
Models comprised of deep convolutional network layers have dominated recent
image interpretation tasks; we investigate whether models which are also compositional, or" …

Exploring architectures, data and units for streaming end-to-end speech recognition with RNN-transducer

K Rao, H Sak, R Prabhavalkar - 2017 IEEE automatic speech …, 2017 - ieeexplore.ieee.org
We investigate training end-to-end speech recognition models with the recurrent neural
network transducer (RNN-T): a streaming, all-neural, sequence-to-sequence architecture …

Sequence-to-sequence learning as beam-search optimization

S Wiseman, AM Rush - arXiv preprint arXiv:1606.02960, 2016 - arxiv.org
Sequence-to-Sequence (seq2seq) modeling has rapidly become an important general-
purpose NLP tool that has proven effective for many text-generation and sequence-labeling …

Automatic speech recognition

D Yu, L Deng - 2016 - Springer
Automatic Speech Recognition (ASR), which is aimed to enable natural human–machine
interaction, has been an intensive research area for decades. Many core technologies, such …

Parallel recurrent neural network architectures for feature-rich session-based recommendations

B Hidasi, M Quadrana, A Karatzoglou… - Proceedings of the 10th …, 2016 - dl.acm.org
Real-life recommender systems often face the daunting task of providing recommendations
based only on the clicks of a user session. Methods that rely on user profiles--such as matrix …