Speaker normalization through formant-based warping of the frequency scale.

D O'Shaughnessy - Proceedings of the IEEE, 2003 - ieeexplore.ieee.org

This paper examines how people communicate with computers using speech. Automatic
speech recognition (ASR) transforms speech into text, while automatic speech synthesis [or …

Cited by 272 - Related articles - All 4 versions

On short-time estimation of vocal tract length from formant frequencies

AC Lammert, SS Narayanan - PloS one, 2015 - journals.plos.org

Vocal tract length is highly variable across speakers and determines many aspects of the
acoustic speech signal, making it an essential parameter to consider for explaining …

Cited by 91 - Related articles - All 13 versions

Generative modeling of pseudo-whisper for robust whispered speech recognition

S Ghaffarzadegan, H Bořil… - IEEE/ACM Transactions …, 2016 - ieeexplore.ieee.org

Whisper is a common means of communication used to avoid disturbing individuals or to
exchange private information. As a vocal style, whisper would be an ideal candidate for …

Cited by 40 - Related articles - All 5 versions

Frequency warping for VTLN and speaker adaptation by linear transformation of standard MFCC

S Panchapagesan, A Alwan - Computer speech & language, 2009 - Elsevier

Vocal tract length normalization (VTLN) for standard filterbank-based Mel frequency cepstral
coefficient (MFCC) features is usually implemented by warping the center frequencies of the …

Cited by 55 - Related articles - All 11 versions

Adaptation of children's speech with limited data based on formant-like peak alignment

X Cui, A Alwan - Computer speech & language, 2006 - Elsevier

Automatic recognition of children's speech using acoustic models trained by adults results in
poor performance due to differences in speech acoustics. These acoustical differences are a …

Cited by 41 - Related articles - All 9 versions

Flux-closure-domain states and demagnetizing energy determination in sub-micron size magnetic dots

PO Jubert, JC Toussaint, O Fruchart, C Meyer… - Europhysics …, 2003 - iopscience.iop.org

We used single-crystalline Fe dots self-assembled under UHV as a model system to discuss
micromagnetic properties of sub-micron size magnetic dots and show what properties may …

Cited by 48 - Related articles - All 14 versions

A low-complexity parabolic lip contour model with speaker normalization for high-level feature extraction in noise-robust audiovisual speech recognition

BJ Borgstrom, A Alwan - … Systems, Man, and Cybernetics-Part A …, 2008 - ieeexplore.ieee.org

This paper proposes a novel low-complexity lip contour model for high-level optic feature
extraction in noise-robust audiovisual (AV) automatic speech recognition systems. The …

Cited by 29 - Related articles - All 5 versions

[PDF] Normalization in the acoustic feature space for improved speech recognition

S Molau - 2003 - d-nb.info

In this work, normalization techniques in the acoustic feature space are studied which
improve the robustness of automatic speech recognition systems. It is shown that there is a …

Cited by 37 - Related articles - All 2 versions

Analyse et modèle génératif de l'expressivité: application à la parole et à l'interprétation musicale

G Beller - 2009 - theses.hal.science

Contents excerpt: 1 Current context; 1.2 Stakes of the thesis; 1.3 Central proposition; 1.3.1 The actor's paradox; 1.3.2 …

Cited by 27 - Related articles - All 16 versions

Vocal Tract Length Normalization using a Gaussian mixture model framework for query-by-example spoken term detection

MC Madhavi, HA Patil - Computer Speech & Language, 2019 - Elsevier

A speech spectrum is known to be changed by the variations in the length of the vocal tract
of a speaker. This is because of the fact that speech formants are inversely related to the …

Cited by 11 - Related articles - All 3 versions