Audio-visual speech recognition using deep learning
Audio-visual speech recognition (AVSR) system is thought to be one of the most promising
solutions for reliable speech recognition, particularly when the audio is corrupted by noise …
solutions for reliable speech recognition, particularly when the audio is corrupted by noise …
Weakly supervised learning with multi-stream CNN-LSTM-HMMs to discover sequential parallelism in sign language videos
In this work we present a new approach to the field of weakly supervised learning in the
video domain. Our method is relevant to sequence learning problems which can be split up …
video domain. Our method is relevant to sequence learning problems which can be split up …
Method and configuration for determining a descriptive feature of a speech signal
M Holzapfel - US Patent 6,523,005, 2003 - Google Patents
A method and also a configuration for determining a descriptive feature of a speech signal,
in which a first speech model is trained with a first time pattern and a second speech model …
in which a first speech model is trained with a first time pattern and a second speech model …
Recent advances in the multi-stream HMM/ANN hybrid approach to noise robust ASR
A Hagen, A Morris - Computer Speech & Language, 2005 - Elsevier
In this article we review several successful extensions to the standard hidden-Markov-
model/artificial neural network (HMM/ANN) hybrid, which have recently made important …
model/artificial neural network (HMM/ANN) hybrid, which have recently made important …
Subband-based speech recognition
H Bourlard, S Dupont - 1997 IEEE International Conference on …, 1997 - ieeexplore.ieee.org
In the framework of hidden Markov models (HMM) or hybrid HMM/artificial neural network
(ANN) systems, we present a new approach towards automatic speech recognition (ASR) …
(ANN) systems, we present a new approach towards automatic speech recognition (ASR) …
A multiobjective learning and ensembling approach to high-performance speech enhancement with compact neural network architectures
In this study, we propose a novel deep neural network (DNN) architecture for speech
enhancement (SE) via a multiobjective learning and ensembling (MOLE) framework to …
enhancement (SE) via a multiobjective learning and ensembling (MOLE) framework to …
Some solution to the missing feature problem in data classification, with application to noise robust ASR
We address the theoretical and practical issues involved in automatic speech recognition
(ASR) when some of the observation data for the target signal is masked by other signals …
(ASR) when some of the observation data for the target signal is masked by other signals …
[PDF][PDF] Opportunities and challenges of parallelizing speech recognition
Automatic speech recognition enables a wide range of current and emerging applications
such as automatic transcription, multimedia content analysis, and natural human-computer …
such as automatic transcription, multimedia content analysis, and natural human-computer …
An HMM-based framework for video semantic analysis
G Xu, YF Ma, HJ Zhang, SQ Yang - IEEE Transactions on …, 2005 - ieeexplore.ieee.org
Video semantic analysis is essential in video indexing and structuring. However, due to the
lack of robust and generic algorithms, most of the existing works on semantic analysis are …
lack of robust and generic algorithms, most of the existing works on semantic analysis are …
An articulatory feature-based tandem approach and factored observation modeling
O Cetin, A Kantor, S King, C Bartels… - … , Speech and Signal …, 2007 - ieeexplore.ieee.org
The so-called tandem approach, where the posteriors of a multilayer perceptron (MLP)
classifier are used as features in an automatic speech recognition (ASR) system has proven …
classifier are used as features in an automatic speech recognition (ASR) system has proven …