Sparse hidden Markov models for speech enhancement in non-stationary noise environments

F Deng, C Bao, WB Kleijn - IEEE/ACM Transactions on Audio …, 2015 - ieeexplore.ieee.org
We propose a sparse hidden Markov model (HMM)-based single-channel speech
enhancement method that models the speech and noise gains accurately in non-stationary …

A DNN-HMM-DNN hybrid model for discovering word-like units from spoken captions and image regions

L Wang, M Hasegawa-Johnson - Interspeech, 2020 - par.nsf.gov
Discovering word-like units without textual transcriptions is an important step in low-resource
speech technology. In this work, we demonstrate a model inspired by statistical machine …

[PDF] Multimodal Word Discovery and Retrieval with Phone Sequence and Image Concepts.

L Wang, MA Hasegawa-Johnson - INTERSPEECH, 2019 - isca-archive.org
This paper demonstrates three different systems capable of performing the multimodal word
discovery task. A multimodal word discovery system accepts, as input, a database of spoken …

Sparse HMM-based speech enhancement method for stationary and non-stationary noise environments

F Deng, C Bao, WB Kleijn - 2015 IEEE International …, 2015 - ieeexplore.ieee.org
We propose a sparse hidden Markov model (HMM)-based single-channel speech
enhancement method that models the speech and noise gains accurately in both stationary …

[PDF] A PAC-Bayesian approach to minimum perplexity language modeling

S Bharadwaj… - Proceedings of COLING …, 2014 - aclanthology.org
Despite the overwhelming use of statistical language models in speech recognition,
machine translation, and several other domains, few high probability guarantees exist on …

Multimodal word discovery and retrieval with spoken descriptions and visual concepts

L Wang, M Hasegawa-Johnson - IEEE/ACM Transactions on …, 2020 - ieeexplore.ieee.org
In the absence of dictionaries, translators, or grammars, it is still possible to learn some of
the words of a new language by listening to spoken descriptions of images. If several …

[BOOK] A theory of (almost) zero resource speech recognition

SS Bharadwaj - 2015 - search.proquest.com
Automatic speech recognition has matured into a commercially successful technology,
enabling voice-based interfaces for smartphones, smart TVs, and many other consumer …

Dynamics on networks

L Wang - 2020 - ideals.illinois.edu
The main focus of this thesis is to study the stability of fixed points for a dynamical
system. In the first part, we consider two dynamical models whose underlying graph can be …

An Approach for Speech Recognition Technique

MS Basha, BR Subbaiah… - i-manager's Journal on …, 2014 - search.proquest.com
The design of a speech recognition system requires careful attention to the following
issues: classification of various types of speech classes, speech representation, and feature …

[PDF] Mark Hasegawa-Johnson

S Xi, PB Kappa - linguistics.illinois.edu
2. T. Taniguchi, MA Johnson, and Y. Ohta, “Multi-vector pitch-orthogonal LPC: quality speech
with low complexity at rates between 4 and 8 kbps,” ICSLP, Kobe, pp. 113-116, 1990. 3. MA …