Light gated recurrent units for speech recognition

M Ravanelli, P Brakel, M Omologo… - IEEE Transactions on …, 2018 - ieeexplore.ieee.org
A field that has directly benefited from the recent advances in deep learning is automatic
speech recognition (ASR). Despite the great achievements of the past decades, however, a …

Survey on machine learning in speech emotion recognition and vision systems using a recurrent neural network (RNN)

SP Yadav, S Zaidi, A Mishra, V Yadav - Archives of Computational …, 2022 - Springer
This is a survey paper that aims to review the finest architectures of machine
learning, the use of algorithms and the applications of the system and speech and vision …

Survey on deep neural networks in speech and vision systems

M Alam, MD Samad, L Vidyaratne, A Glandon… - Neurocomputing, 2020 - Elsevier
This survey presents a review of state-of-the-art deep neural network architectures,
algorithms, and systems in speech and vision applications. Recent advances in deep …

Recent progresses in deep learning based acoustic models

D Yu, J Li - IEEE/CAA Journal of Automatica Sinica, 2017 - ieeexplore.ieee.org
In this paper, we summarize recent progresses made in deep learning based acoustic
models and the motivation and insights behind the surveyed techniques. We first discuss …

Personalized speech recognition on mobile devices

I McGraw, R Prabhavalkar, R Alvarez… - … , Speech and Signal …, 2016 - ieeexplore.ieee.org
We describe a large vocabulary speech recognition system that is accurate, has low latency,
and yet has a small enough memory and computational footprint to run faster than real-time …

On the compression of recurrent neural networks with an application to LVCSR acoustic modeling for embedded speech recognition

R Prabhavalkar, O Alsharif, A Bruguier… - … on Acoustics, Speech …, 2016 - ieeexplore.ieee.org
We study the problem of compressing recurrent neural networks (RNNs). In particular, we
focus on the compression of RNN acoustic models, which are motivated by the goal of …

Model compression applied to small-footprint keyword spotting

G Tucker, M Wu, M Sun, S Panchapagesan, G Fu… - 2016 - amazon.science
Several consumer speech devices feature voice interfaces that perform on-device keyword
spotting to initiate user interactions. Accurate on-device keyword spotting within a tight CPU …

Compressing CNN-DBLSTM models for OCR with teacher-student learning and Tucker decomposition

H Ding, K Chen, Q Huo - Pattern Recognition, 2019 - Elsevier
Integrated convolutional neural network (CNN) and deep bidirectional long short-term
memory (DBLSTM) based character models have achieved excellent recognition accuracies …

On Online Attention-Based Speech Recognition and Joint Mandarin Character-Pinyin Training.

W Chan, IR Lane - Interspeech, 2016 - isca-archive.org
In this paper, we explore the use of attention-based models for online speech recognition
without the usage of language models or searching. Our model is based on an attention …

Binary neural networks for speech recognition

Y Qian, X Xiang - Frontiers of Information Technology & Electronic …, 2019 - Springer
Recently, deep neural networks (DNNs) significantly outperform Gaussian mixture models in
acoustic modeling for speech recognition. However, the substantial increase in …