Subspace Gaussian mixture models for speech recognition

GM Harshvardhan, MK Gourisaria, M Pandey… - Computer Science …, 2020 - Elsevier

Generative models have been in existence for many decades. In the field of machine
learning, we come across many scenarios when directly learning a target is intractable …

Cited by 330 - Related articles - All 2 versions

[PDF] kresttechnology.com

An overview of noise-robust automatic speech recognition

J Li, L Deng, Y Gong… - IEEE/ACM Transactions …, 2014 - ieeexplore.ieee.org

New waves of consumer-centric applications, such as voice search and voice interaction
with mobile devices and home entertainment systems, increasingly require automatic …

Cited by 665 - Related articles - All 9 versions

[PDF] ucl.ac.uk

DeepEar: robust smartphone audio sensing in unconstrained acoustic environments using deep learning

ND Lane, P Georgiev, L Qendro - … of the 2015 ACM international joint …, 2015 - dl.acm.org

Microphones are remarkably powerful sensors of human behavior and context. However,
audio sensing is highly susceptible to wild fluctuations in accuracy when used in diverse …

Cited by 408 - Related articles - All 7 versions

[PDF] epfl.ch

The subspace Gaussian mixture model—A structured model for speech recognition

D Povey, L Burget, M Agarwal, P Akyazi, F Kai… - Computer Speech & …, 2011 - Elsevier

We describe a new approach to speech recognition, in which all Hidden Markov Model
(HMM) states share the same Gaussian Mixture Model (GMM) structure with the same …

Cited by 391 - Related articles - All 21 versions

Machine learning in automatic speech recognition: A survey

J Padmanabhan… - IETE Technical Review, 2015 - Taylor & Francis

Over the past few decades, there has been tremendous development in machine learning
paradigms used in automatic speech recognition (ASR) for home automation to space …

Cited by 192 - Related articles

[PDF] danielpovey.com

Minimum Bayes risk decoding and system combination based on a recursion for edit distance

H Xu, D Povey, L Mangu, J Zhu - Computer Speech & Language, 2011 - Elsevier

In this paper we describe a method that can be used for Minimum Bayes Risk (MBR)
decoding for speech recognition. Our algorithm can take as input either a single lattice, or …

Cited by 205 - Related articles - All 12 versions

[PDF] sigport.org

Revisiting hidden Markov models for speech emotion recognition

S Mao, D Tao, G Zhang, PC Ching… - ICASSP 2019-2019 …, 2019 - ieeexplore.ieee.org

Hidden Markov models (HMMs) have a long tradition in automatic speech recognition (ASR)
due to their capability of capturing temporal dynamic characteristics of speech. For emotion …

Cited by 82 - Related articles - All 2 versions

[PDF] psu.edu

Multilingual acoustic modeling for speech recognition based on subspace Gaussian mixture models

L Burget, P Schwarz, M Agarwal… - … on acoustics, speech …, 2010 - ieeexplore.ieee.org

Although research has previously been done on multilingual speech recognition, it has been
found to be very difficult to improve over separately trained systems. The usual approach …

Cited by 209 - Related articles - All 15 versions

Multitask learning of deep neural networks for low-resource speech recognition

D Chen, BKW Mak - IEEE/ACM Transactions on Audio, Speech …, 2015 - ieeexplore.ieee.org

We propose a multitask learning (MTL) approach to improve low-resource automatic speech
recognition using deep neural networks (DNNs) without requiring additional language …

Cited by 127 - Related articles - All 4 versions

[PDF] mit.edu

Spoken content retrieval—beyond cascading speech recognition with text retrieval

L Lee, J Glass, H Lee, C Chan - IEEE/ACM Transactions on …, 2015 - ieeexplore.ieee.org

Spoken content retrieval refers to directly indexing and retrieving spoken content based on
the audio rather than text descriptions. This potentially eliminates the requirement of …

Cited by 129 - Related articles - All 12 versions