Can we generate emotional pronunciations for expressive speech synthesis?
In the field of expressive speech synthesis, a lot of work has been conducted on
suprasegmental prosodic features, while little has been done on pronunciation variants …
Discriminative pronunciation modeling for dialectal speech recognition
Speech recognizers are typically trained with data from a standard dialect and do not
generalize to non-standard dialects. Mismatch mainly occurs in the acoustic realization of …
Deriving disyllabic word variants from a Chinese conversational speech corpus
YF Liu, SC Tseng, JSR Jang - The Journal of the Acoustical Society of …, 2016 - pubs.aip.org
Motivated by the quasi-categorical reduced forms of disyllabic words produced in Chinese
conversational speech, a frequency-based selection procedure of typical pronunciation by …
Probabilistic speaker pronunciation adaptation for spontaneous speech synthesis using linguistic features
Pronunciation adaptation consists in predicting pronunciation variants of words and
utterances based on their standard pronunciation and a target style. This is a key issue in …
Improving TTS with corpus-specific pronunciation adaptation
Text-to-speech (TTS) systems are built on speech corpora which are labeled with carefully
checked and segmented phonemes. However, phoneme sequences generated by …
Optimal feature set and minimal training size for pronunciation adaptation in TTS
Text-to-Speech (TTS) systems rely on a grapheme-to-phoneme converter which is
built to produce canonical, or statically stylized, pronunciations. Hence, the TTS quality …
A knowledge-based system for stop consonant identification based on speech spectrogram reading
LF Lamel - Computer Speech & Language, 1993 - Elsevier
In order to formalize the information used in spectrogram reading, a knowledge-based
system for identifying spoken stop consonants was developed. Speech spectrogram reading …
Traitement automatique de la parole expressive: retour vers des systèmes interprétables? (Automatic processing of expressive speech: a return to interpretable systems?)
M Tahon - 2023 - hal.science
Speech is a fundamental means of communication that takes place within an interaction
between the speaker and their listeners. Beyond its semantic content, the speech signal …
Statistical pronunciation adaptation for spontaneous speech synthesis
To bring more expressiveness into text-to-speech systems, this paper presents a new
pronunciation variant generation method which works by adapting standard, i.e., dictionary …
Adaptation de la prononciation pour la synthèse de la parole spontanée en utilisant des informations linguistiques (Pronunciation adaptation for spontaneous speech …
This article presents a new pronunciation adaptation method whose aim is
to reproduce the spontaneous style. This is a key task in speech synthesis because it …