A comprehensive survey and analysis of generative models in machine learning

GM Harshvardhan, MK Gourisaria, M Pandey… - Computer Science …, 2020 - Elsevier
Generative models have been in existence for many decades. In the field of machine
learning, we come across many scenarios when directly learning a target is intractable …

An overview of noise-robust automatic speech recognition

J Li, L Deng, Y Gong… - IEEE/ACM Transactions …, 2014 - ieeexplore.ieee.org
New waves of consumer-centric applications, such as voice search and voice interaction
with mobile devices and home entertainment systems, increasingly require automatic …

DeepEar: robust smartphone audio sensing in unconstrained acoustic environments using deep learning

ND Lane, P Georgiev, L Qendro - … of the 2015 ACM international joint …, 2015 - dl.acm.org
Microphones are remarkably powerful sensors of human behavior and context. However,
audio sensing is highly susceptible to wild fluctuations in accuracy when used in diverse …

The subspace Gaussian mixture model—A structured model for speech recognition

D Povey, L Burget, M Agarwal, P Akyazi, F Kai… - Computer Speech & …, 2011 - Elsevier
We describe a new approach to speech recognition, in which all Hidden Markov Model
(HMM) states share the same Gaussian Mixture Model (GMM) structure with the same …

Machine learning in automatic speech recognition: A survey

J Padmanabhan… - IETE Technical Review, 2015 - Taylor & Francis
Over the past few decades, there has been tremendous development in machine learning
paradigms used in automatic speech recognition (ASR) for home automation to space …

Minimum Bayes risk decoding and system combination based on a recursion for edit distance

H Xu, D Povey, L Mangu, J Zhu - Computer Speech & Language, 2011 - Elsevier
In this paper we describe a method that can be used for Minimum Bayes Risk (MBR)
decoding for speech recognition. Our algorithm can take as input either a single lattice, or …

Revisiting hidden Markov models for speech emotion recognition

S Mao, D Tao, G Zhang, PC Ching… - ICASSP 2019-2019 …, 2019 - ieeexplore.ieee.org
Hidden Markov models (HMMs) have a long tradition in automatic speech recognition (ASR)
due to their capability of capturing temporal dynamic characteristics of speech. For emotion …

Multilingual acoustic modeling for speech recognition based on subspace Gaussian mixture models

L Burget, P Schwarz, M Agarwal… - … on acoustics, speech …, 2010 - ieeexplore.ieee.org
Although research has previously been done on multilingual speech recognition, it has been
found to be very difficult to improve over separately trained systems. The usual approach …

Multitask learning of deep neural networks for low-resource speech recognition

D Chen, BKW Mak - IEEE/ACM Transactions on Audio, Speech …, 2015 - ieeexplore.ieee.org
We propose a multitask learning (MTL) approach to improve low-resource automatic speech
recognition using deep neural networks (DNNs) without requiring additional language …

Spoken content retrieval—beyond cascading speech recognition with text retrieval

L Lee, J Glass, H Lee, C Chan - IEEE/ACM Transactions on …, 2015 - ieeexplore.ieee.org
Spoken content retrieval refers to directly indexing and retrieving spoken content based on
the audio rather than text descriptions. This potentially eliminates the requirement of …