Generation of Large-Scale Simulated Utterances in Virtual Rooms to Train Deep-Neural Networks for Far-Field Speech Recognition in Google Home.
We describe the structure and application of an acoustic room simulator to generate large-
scale simulated data for training deep neural networks for far-field speech recognition. The …
Signal separation for robust speech recognition based on phase difference information obtained in the frequency domain.
In this paper, we present a new two-microphone approach that improves speech recognition
accuracy when speech is masked by other speech. The algorithm improves on previous …
Recurrent models for auditory attention in multi-microphone distance speech recognition
Integration of multiple microphone data is one of the key ways to achieve robust speech
recognition in noisy environments or when the speaker is located at some distance from the …
Signal processing for robust speech recognition motivated by auditory processing
C Kim - Ph. D. dissertation, 2010 - lti.cmu.edu
Although automatic speech recognition systems have dramatically improved in recent
decades, speech recognition accuracy still significantly degrades in noisy environments …
Robust speech recognition using temporal masking and thresholding algorithm.
In this paper, we present a new dereverberation algorithm called Temporal Masking and
Thresholding (TMT) to enhance the temporal spectra of spectral features for robust speech …
Binaural sound source separation motivated by auditory processing
In this paper we present a new method of signal processing for robust speech recognition
using two microphones. The method, loosely based on the human binaural hearing system …
Adjustable Coherent-to-Diffuse Power Estimator for Binaural Speech Enhancement in Multi-Talker Environments
The binaural coherence-to-diffuse power ratio (CDR) estimate in reverberant environments
is essential in many speech enhancement algorithms applied within hear-through systems …
Sound source separation using phase difference and reliable mask selection
C Kim, A Menon, M Bacchiani… - 2018 IEEE International …, 2018 - ieeexplore.ieee.org
In this paper, we present an algorithm called Reliable Mask Selection-Phase Difference
Channel Weighting (RMS-PDCW) which selects the target source masked by a noise source …
Multiple speaker tracking using a microphone array by combining auditory processing and a Gaussian mixture cardinalized probability hypothesis density filter
Tracking speakers is an important application in smart environments. Acoustic tracking using
microphone arrays is a challenging task due to two major reasons: On the one hand …
Sound source separation using phase difference and reliable mask selection
C Kim, A Menon, M Bacchiani… - … on Acoustics, Speech …, 2018 - research.google.com
In this paper, we present an algorithm called Reliable Mask Selection-Phase Difference
Channel Weighting (RMS-PDCW) which selects the target source masked by a noise source …