Speech recognition by machine, a review

MA Anusuya, SK Katti - arXiv preprint arXiv:1001.2267, 2010 - arxiv.org
This paper presents a brief survey on Automatic Speech Recognition and discusses the
major themes and advances made in the past 60 years of research, so as to provide a …

An overview of speaker identification: Accuracy and robustness issues

R Togneri, D Pullella - IEEE Circuits and Systems Magazine, 2011 - ieeexplore.ieee.org
This paper presents the main paradigms for speaker identification, and recent work on
missing data methods to increase robustness. The feature extraction, speaker modeling and …

Robust speaker recognition in noisy conditions

J Ming, TJ Hazen, JR Glass… - IEEE Transactions on …, 2007 - ieeexplore.ieee.org
This paper investigates the problem of speaker identification and verification in noisy
conditions, assuming that speech signals are corrupted by environmental noise, but …

Interacting with computers by voice: automatic speech recognition and synthesis

D O'Shaughnessy - Proceedings of the IEEE, 2003 - ieeexplore.ieee.org
This paper examines how people communicate with computers using speech. Automatic
speech recognition (ASR) transforms speech into text, while automatic speech synthesis [or …

Combining temporal features by local binary pattern for acoustic scene classification

W Yang, S Krishnan - IEEE/ACM Transactions on Audio …, 2017 - ieeexplore.ieee.org
The popular frequency-domain features Mel-frequency cepstral coefficients (MFCCs) have
been widely used for the task of acoustic scene classification (ASC). The MFCC feature …

Robust recognition of simultaneous speech by a mobile robot

JM Valin, S Yamamoto, J Rouat… - IEEE Transactions …, 2007 - ieeexplore.ieee.org
This paper describes a system that gives a mobile robot the ability to perform automatic
speech recognition with simultaneous speakers. A microphone array is used along with a …

A corpus-based approach to speech enhancement from nonstationary noise

J Ming, R Srinivasan, D Crookes - IEEE Transactions on Audio …, 2010 - ieeexplore.ieee.org
Temporal dynamics and speaker characteristics are two important features of speech that
distinguish speech from noise. In this paper, we propose a method to maximally extract …

Parallel synthesis for autoregressive speech generation

P Hsu, D Liu, AT Liu, H Lee - IEEE/ACM Transactions on Audio …, 2023 - ieeexplore.ieee.org
Autoregressive neural vocoders have achieved outstanding performance in speech
synthesis tasks such as text-to-speech and voice conversion. An autoregressive vocoder …

Subband correlation and robust speech recognition

J McAuley, J Ming, D Stewart… - IEEE Transactions on …, 2005 - ieeexplore.ieee.org
This paper investigates the effect of modeling subband correlation for noisy speech
recognition. Subband feature streams are assumed to be independent in many subband …

Combining missing-feature theory, speech enhancement, and speaker-dependent/-independent modeling for speech separation

J Ming, TJ Hazen, JR Glass - Computer Speech & Language, 2010 - Elsevier
This paper considers the separation and recognition of overlapped speech sentences
assuming single-channel observation. A system based on a combination of several different …