Interacting with computers by voice: automatic speech recognition and synthesis
D O'Shaughnessy - Proceedings of the IEEE, 2003 - ieeexplore.ieee.org
This paper examines how people communicate with computers using speech. Automatic
speech recognition (ASR) transforms speech into text, while automatic speech synthesis [or …
On short-time estimation of vocal tract length from formant frequencies
AC Lammert, SS Narayanan - PloS one, 2015 - journals.plos.org
Vocal tract length is highly variable across speakers and determines many aspects of the
acoustic speech signal, making it an essential parameter to consider for explaining …
Generative modeling of pseudo-whisper for robust whispered speech recognition
S Ghaffarzadegan, H Bořil… - IEEE/ACM Transactions …, 2016 - ieeexplore.ieee.org
Whisper is a common means of communication used to avoid disturbing individuals or to
exchange private information. As a vocal style, whisper would be an ideal candidate for …
Frequency warping for VTLN and speaker adaptation by linear transformation of standard MFCC
S Panchapagesan, A Alwan - Computer speech & language, 2009 - Elsevier
Vocal tract length normalization (VTLN) for standard filterbank-based Mel frequency cepstral
coefficient (MFCC) features is usually implemented by warping the center frequencies of the …
Adaptation of children's speech with limited data based on formant-like peak alignment
Automatic recognition of children's speech using acoustic models trained by adults results in
poor performance due to differences in speech acoustics. These acoustical differences are a …
Flux-closure-domain states and demagnetizing energy determination in sub-micron size magnetic dots
PO Jubert, JC Toussaint, O Fruchart, C Meyer… - Europhysics …, 2003 - iopscience.iop.org
We used single-crystalline Fe dots self-assembled under UHV as a model system to discuss
micromagnetic properties of sub-micron size magnetic dots and show what properties may …
A low-complexity parabolic lip contour model with speaker normalization for high-level feature extraction in noise-robust audiovisual speech recognition
BJ Borgstrom, A Alwan - … Systems, Man, and Cybernetics-Part A …, 2008 - ieeexplore.ieee.org
This paper proposes a novel low-complexity lip contour model for high-level optic feature
extraction in noise-robust audiovisual (AV) automatic speech recognition systems. The …
Normalization in the acoustic feature space for improved speech recognition
S Molau - 2003 - d-nb.info
In this work, normalization techniques in the acoustic feature space are studied which
improve the robustness of automatic speech recognition systems. It is shown that there is a …
Analyse et modèle génératif de l'expressivité: application à la parole et à l'interprétation musicale
G Beller - 2009 - theses.hal.science
1 Current context; 1.2 Aim of the thesis; 1.3 Central proposition;
1.3.1 The actor's paradox; 1.3.2 …
Vocal Tract Length Normalization using a Gaussian mixture model framework for query-by-example spoken term detection
MC Madhavi, HA Patil - Computer Speech & Language, 2019 - Elsevier
A speech spectrum is known to be changed by the variations in the length of the vocal tract
of a speaker. This is because of the fact that speech formants are inversely related to the …