Binaural and multiple-microphone signal processing motivated by auditory perception

Generation of Large-Scale Simulated Utterances in Virtual Rooms to Train Deep-Neural Networks for Far-Field Speech Recognition in Google Home.

C Kim, A Misra, KK Chin, T Hughes, A Narayanan… - …, 2017 - research.google.com

We describe the structure and application of an acoustic room simulator to generate large-
scale simulated data for training deep neural networks for far-field speech recognition. The …

Cited by: 276 · Related articles · All 15 versions

Signal separation for robust speech recognition based on phase difference information obtained in the frequency domain.

C Kim, K Kumar, B Raj, RM Stern - INTERSPEECH, 2009 - isca-archive.org

In this paper, we present a new two-microphone approach that improves speech recognition
accuracy when speech is masked by other speech. The algorithm improves on previous …

Cited by: 60 · Related articles · All 11 versions

Recurrent models for auditory attention in multi-microphone distance speech recognition

S Kim, I Lane - arXiv preprint arXiv:1511.06407, 2015 - arxiv.org

Integration of multiple microphone data is one of the key ways to achieve robust speech
recognition in noisy environments or when the speaker is located at some distance from the …

Cited by: 34 · Related articles · All 9 versions

Signal processing for robust speech recognition motivated by auditory processing

C Kim - Ph. D. dissertation, 2010 - lti.cmu.edu

Although automatic speech recognition systems have dramatically improved in recent
decades, speech recognition accuracy still significantly degrades in noisy environments …

Cited by: 45 · Related articles · All 4 versions

Robust speech recognition using temporal masking and thresholding algorithm.

C Kim, KK Chin, M Bacchiani, RM Stern - Interspeech, 2014 - isca-archive.org

In this paper, we present a new dereverberation algorithm called Temporal Masking and
Thresholding (TMT) to enhance the temporal spectra of spectral features for robust speech …

Cited by: 23 · Related articles · All 12 versions

Binaural sound source separation motivated by auditory processing

C Kim, K Kumar, RM Stern - 2011 IEEE International …, 2011 - ieeexplore.ieee.org

In this paper we present a new method of signal processing for robust speech recognition
using two microphones. The method, loosely based on the human binaural hearing system …

Cited by: 35 · Related articles · All 11 versions

Adjustable Coherent-to-Diffuse Power Estimator for Binaural Speech Enhancement in Multi-Talker Environments

R Ghanavi, CT Jin - IEEE/ACM Transactions on Audio, Speech …, 2024 - ieeexplore.ieee.org

The binaural coherence-to-diffuse power ratio (CDR) estimate in reverberant environments
is essential in many speech enhancement algorithms applied within hear-through systems …

Sound source separation using phase difference and reliable mask selection

C Kim, A Menon, M Bacchiani… - 2018 IEEE International …, 2018 - ieeexplore.ieee.org

In this paper, we present an algorithm called Reliable Mask Selection-Phase Difference
Channel Weighting (RMS-PDCW) which selects the target source masked by a noise source …

Cited by: 7 · Related articles · All 5 versions

Multiple speaker tracking using a microphone array by combining auditory processing and a Gaussian mixture cardinalized probability hypothesis density filter

A Plinge, D Hauschildt, MH Hennecke… - … on Acoustics, Speech …, 2011 - ieeexplore.ieee.org

Tracking speakers is an important application in smart environments. Acoustic tracking using
microphone arrays is a challenging task due to two major reasons: On the one hand …

Cited by: 12 · Related articles · All 6 versions

Sound source separation using phase difference and reliable mask selection

C Kim, A Menon, M Bacchiani… - … on Acoustics, Speech …, 2018 - research.google.com

In this paper, we present an algorithm called Reliable Mask Selection-Phase Difference
Channel Weighting (RMS-PDCW) which selects the target source masked by a noise source …

Cited by: 5 · Related articles · All 7 versions