Robust speech recognition using probabilistic union models

MA Anusuya, SK Katti - arXiv preprint arXiv:1001.2267, 2010 - arxiv.org

This paper presents a brief survey on Automatic Speech Recognition and discusses the
major themes and advances made in the past 60 years of research, so as to provide a …

Cited by 750 - Related articles - All 2 versions

An overview of speaker identification: Accuracy and robustness issues

R Togneri, D Pullella - IEEE circuits and systems magazine, 2011 - ieeexplore.ieee.org

This paper presents the main paradigms for speaker identification, and recent work on
missing data methods to increase robustness. The feature extraction, speaker modeling and …

Cited by 400 - Related articles - All 8 versions

Robust speaker recognition in noisy conditions

J Ming, TJ Hazen, JR Glass… - IEEE Transactions on …, 2007 - ieeexplore.ieee.org

This paper investigates the problem of speaker identification and verification in noisy
conditions, assuming that speech signals are corrupted by environmental noise, but …

Cited by 341 - Related articles - All 16 versions

Interacting with computers by voice: automatic speech recognition and synthesis

D O'Shaughnessy - Proceedings of the IEEE, 2003 - ieeexplore.ieee.org

This paper examines how people communicate with computers using speech. Automatic
speech recognition (ASR) transforms speech into text, while automatic speech synthesis [or …

Cited by 271 - Related articles - All 4 versions

Combining temporal features by local binary pattern for acoustic scene classification

W Yang, S Krishnan - IEEE/ACM Transactions on Audio …, 2017 - ieeexplore.ieee.org

The popular frequency-domain features Mel-frequency cepstral coefficients (MFCCs) have
been widely used for the task of acoustic scene classification (ASC). The MFCC feature …

Cited by 67 - Related articles - All 3 versions

Robust recognition of simultaneous speech by a mobile robot

JM Valin, S Yamamoto, J Rouat… - IEEE Transactions …, 2007 - ieeexplore.ieee.org

This paper describes a system that gives a mobile robot the ability to perform automatic
speech recognition with simultaneous speakers. A microphone array is used along with a …

Cited by 109 - Related articles - All 23 versions

A corpus-based approach to speech enhancement from nonstationary noise

J Ming, R Srinivasan, D Crookes - IEEE Transactions on Audio …, 2010 - ieeexplore.ieee.org

Temporal dynamics and speaker characteristics are two important features of speech that
distinguish speech from noise. In this paper, we propose a method to maximally extract …

Cited by 74 - Related articles - All 10 versions

Parallel synthesis for autoregressive speech generation

P Hsu, D Liu, AT Liu, H Lee - IEEE/ACM Transactions on Audio …, 2023 - ieeexplore.ieee.org

Autoregressive neural vocoders have achieved outstanding performance in speech
synthesis tasks such as text-to-speech and voice conversion. An autoregressive vocoder …

Cited by 4 - Related articles - All 4 versions

Subband correlation and robust speech recognition

J McAuley, J Ming, D Stewart… - IEEE Transactions on …, 2005 - ieeexplore.ieee.org

This paper investigates the effect of modeling subband correlation for noisy speech
recognition. Subband feature streams are assumed to be independent in many subband …

Cited by 46 - Related articles - All 4 versions

Combining missing-feature theory, speech enhancement, and speaker-dependent/-independent modeling for speech separation

J Ming, TJ Hazen, JR Glass - Computer Speech & Language, 2010 - Elsevier

This paper considers the separation and recognition of overlapped speech sentences
assuming single-channel observation. A system based on a combination of several different …

Cited by 40 - Related articles - All 20 versions