Modeling unvoiced sounds in statistical parametric speech synthesis with a continuous vocoder

Authors

Tamás Gábor Csapó, Géza Németh, Milos Cernak, Philip N Garner

Publication date

2016/8/29

Conference

2016 24th European Signal Processing Conference (EUSIPCO)

Pages

1338-1342

Publisher

IEEE

Description

In this paper, we introduce an improved excitation model for statistical parametric speech synthesis. Our earlier vocoder [1], which applies continuous F0 in combination with Maximum Voiced Frequency (MVF), is extended. The focus of this paper is on the modeling of unvoiced consonants, for which two alternative methods are proposed. The first method applies no postprocessing during MVF estimation to reduce the unwanted voiced component of unvoiced speech sounds. The second separates voiced and unvoiced excitation based on the phonetic labels of the text to be synthesized. In an objective experiment we found that the first method produces unvoiced sounds that are closer to natural speech in terms of Harmonics-to-Noise Ratio. A subjective listening test showed that both methods are more natural than our baseline system, and the second method is significantly preferred.
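
The abstract uses Harmonics-to-Noise Ratio (HNR) as its objective measure but does not say how it was computed. As a rough illustration only, the sketch below uses a common autocorrelation-based estimate, HNR = 10·log10(r / (1 − r)), where r is the strongest normalized autocorrelation peak within a plausible pitch-lag range. The function names, the 60–200 Hz lag range, and the synthetic test frames are assumptions made for this example and are not taken from the paper.

```python
import math
import random


def normalized_autocorr(x, lag):
    """Normalized autocorrelation of x at the given lag (between -1 and 1)."""
    n = len(x) - lag
    num = sum(x[i] * x[i + lag] for i in range(n))
    e0 = sum(x[i] ** 2 for i in range(n))
    e1 = sum(x[i + lag] ** 2 for i in range(n))
    denom = math.sqrt(e0 * e1)
    return num / denom if denom > 0 else 0.0


def hnr_db(x, min_lag, max_lag):
    """Estimate HNR in dB from the strongest autocorrelation peak in the lag range."""
    r = max(normalized_autocorr(x, lag) for lag in range(min_lag, max_lag + 1))
    r = min(max(r, 1e-6), 1.0 - 1e-6)  # clamp so the logarithm stays finite
    return 10.0 * math.log10(r / (1.0 - r))


# Toy frames (assumed, not real speech): a 100 Hz sinusoid with light noise
# stands in for a voiced sound, white noise for an unvoiced consonant.
random.seed(0)
fs = 16000   # sampling rate in Hz (assumed)
n = 800      # 50 ms frame
voiced = [math.sin(2 * math.pi * 100 * i / fs) + 0.05 * random.gauss(0, 1)
          for i in range(n)]
noise = [random.gauss(0, 1) for _ in range(n)]

min_lag, max_lag = fs // 200, fs // 60   # search pitch between 60 and 200 Hz
voiced_hnr = hnr_db(voiced, min_lag, max_lag)
noise_hnr = hnr_db(noise, min_lag, max_lag)
print(f"voiced-like frame HNR: {voiced_hnr:.1f} dB")
print(f"noise-like frame HNR:  {noise_hnr:.1f} dB")
```

Under this measure, a synthesized unvoiced consonant that still carries an unwanted voiced component would score a higher HNR than its natural counterpart, which is the kind of mismatch the objective experiment is described as checking.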

Total citations

Cited by 27

Citations per year: 2016: 1, 2017: 10, 2018: 2, 2019: 2, 2020: 4, 2021: 2, 2022: 3, 2023: 3
