Small-footprint high-performance deep neural network-based speech recognition using split-VQ

M Ravanelli, P Brakel, M Omologo… - IEEE Transactions on …, 2018 - ieeexplore.ieee.org

A field that has directly benefited from the recent advances in deep learning is automatic
speech recognition (ASR). Despite the great achievements of the past decades, however, a …

Cited by 401 · Related articles · All 7 versions

Survey on machine learning in speech emotion recognition and vision systems using a recurrent neural network (RNN)

SP Yadav, S Zaidi, A Mishra, V Yadav - Archives of Computational …, 2022 - Springer

This is a survey paper that aims to give reviews about that finest architectures of machine
learning, the use of algorithms and the applications of the system and speech and vision …

Cited by 91 · Related articles · All 2 versions

Survey on deep neural networks in speech and vision systems

M Alam, MD Samad, L Vidyaratne, A Glandon… - Neurocomputing, 2020 - Elsevier

This survey presents a review of state-of-the-art deep neural network architectures,
algorithms, and systems in speech and vision applications. Recent advances in deep …

Cited by 221 · Related articles · All 8 versions

Recent progresses in deep learning based acoustic models

D Yu, J Li - IEEE/CAA Journal of automatica sinica, 2017 - ieeexplore.ieee.org

In this paper, we summarize recent progresses made in deep learning based acoustic
models and the motivation and insights behind the surveyed techniques. We first discuss …

Cited by 190 · Related articles · All 7 versions

Personalized speech recognition on mobile devices

I McGraw, R Prabhavalkar, R Alvarez… - … , Speech and Signal …, 2016 - ieeexplore.ieee.org

We describe a large vocabulary speech recognition system that is accurate, has low latency,
and yet has a small enough memory and computational footprint to run faster than real-time …

Cited by 221 · Related articles · All 8 versions

On the compression of recurrent neural networks with an application to LVCSR acoustic modeling for embedded speech recognition

R Prabhavalkar, O Alsharif, A Bruguier… - … on Acoustics, Speech …, 2016 - ieeexplore.ieee.org

We study the problem of compressing recurrent neural networks (RNNs). In particular, we
focus on the compression of RNN acoustic models, which are motivated by the goal of …

Cited by 119 · Related articles · All 8 versions

Model compression applied to small-footprint keyword spotting

G Tucker, M Wu, M Sun, S Panchapagesan, G Fu… - 2016 - amazon.science

Several consumer speech devices feature voice interfaces that perform on-device keyword
spotting to initiate user interactions. Accurate on-device keyword spotting within a tight CPU …

Cited by 105 · Related articles · All 6 versions

Compressing CNN-DBLSTM models for OCR with teacher-student learning and Tucker decomposition

H Ding, K Chen, Q Huo - Pattern Recognition, 2019 - Elsevier

Integrated convolutional neural network (CNN) and deep bidirectional long short-term
memory (DBLSTM) based character models have achieved excellent recognition accuracies …

Cited by 39 · Related articles · All 3 versions

On Online Attention-Based Speech Recognition and Joint Mandarin Character-Pinyin Training

W Chan, IR Lane - Interspeech, 2016 - isca-archive.org

In this paper, we explore the use of attention-based models for online speech recognition
without the usage of language models or searching. Our model is based on an attention …

Cited by 55 · Related articles · All 5 versions

Binary neural networks for speech recognition

Y Qian, X Xiang - Frontiers of Information Technology & Electronic …, 2019 - Springer

Recently, deep neural networks (DNNs) significantly outperform Gaussian mixture models in
acoustic modeling for speech recognition. However, the substantial increase in …

Cited by 27 · Related articles · All 4 versions