Voice conversion through vector quantization
M Abe, S Nakamura, K Shikano, H Kuwabara
Journal of the Acoustical Society of Japan (E), 1990 • jstage.jst.go.jp
1. INTRODUCTION
In daily communication, voice individuality is one of the most important aspects of human speech. It is especially important for identifying the other person in a telephone conversation. A technique to control speech individuality, therefore, has an important role and offers many applications. Our present study is concerned with converting voice quality from one speaker to another and developing a technique which enables us to give individuality to synthesized speech. One system goal we can imagine is shown in Fig. 1. Using a voice conversion system as a post-processor for a synthesis-by-rule system, various kinds of speech, such as a particular person's voice, a childlike voice, a husky voice, etc., can be synthesized. For voice conversion, as shown in Fig. 1, it is necessary to have a database of voice individuality.
Speech individuality generally consists of two major factors: acoustic features and prosodic features. As the first step in this research, we are trying to control the acoustic features. 1, 2) According to previous studies, the acoustic features that contribute to speech individuality are distributed among various parameters, such as formant frequencies, formant bandwidths, spectral tilt, and glottal waveforms. 3, 4) Because speech individuality is determined by all of these, it is difficult to control it by modifying each parameter independently. On the other hand, codebooks used in vector quantization represent all of these parameters together. Therefore, the speech individuality of a speaker is represented by the code-vectors in a codebook of the speaker. A conversion of acoustic features from one speaker to another is reduced to the problem of finding a correspondence between the codebooks of the two speakers. The basic problem is, therefore, to find a mapping function from one codebook to another, which we will call a 'mapping codebook.' This is the basic idea of this conversion technique.
In our proposed technique, the mapping codebooks which represent the correspondence between different speakers' codebooks provide the database of speech individuality in Fig. 1.
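The idea of a mapping codebook can be illustrated with a minimal Python sketch. It is not taken from the paper: the excerpt above does not say how the correspondence between codebooks is learned, so the sketch assumes that source and target feature frames are already time-aligned one to one, uses plain k-means as a stand-in for codebook training, and forms each mapping code vector as a co-occurrence-weighted average of target code vectors. All function names (`quantize`, `train_codebook`, `build_mapping_codebook`, `convert`) are hypothetical.

```python
import numpy as np

def quantize(frames, codebook):
    """Return the index of the nearest code vector for each frame."""
    # Squared Euclidean distance between every frame and every code vector.
    d = ((frames[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def train_codebook(frames, size, iters=20, seed=0):
    """Plain k-means; an assumed stand-in for the codebook training step."""
    rng = np.random.default_rng(seed)
    codebook = frames[rng.choice(len(frames), size, replace=False)].copy()
    for _ in range(iters):
        idx = quantize(frames, codebook)
        for k in range(size):
            members = frames[idx == k]
            if len(members):  # leave empty cells unchanged
                codebook[k] = members.mean(axis=0)
    return codebook

def build_mapping_codebook(src_frames, tgt_frames, src_cb, tgt_cb):
    """Assumes src_frames[i] and tgt_frames[i] are already time-aligned."""
    src_idx = quantize(src_frames, src_cb)
    tgt_idx = quantize(tgt_frames, tgt_cb)
    # Histogram of how often source code i co-occurs with target code j.
    hist = np.zeros((len(src_cb), len(tgt_cb)))
    np.add.at(hist, (src_idx, tgt_idx), 1.0)
    totals = hist.sum(axis=1, keepdims=True)
    weights = np.divide(hist, totals, out=np.zeros_like(hist), where=totals > 0)
    # Each mapping vector is a weighted average of target code vectors.
    mapping = weights @ tgt_cb
    # Source codes never observed fall back to the source vector itself.
    unused = totals[:, 0] == 0
    mapping[unused] = src_cb[unused]
    return mapping

def convert(src_frames, src_cb, mapping_cb):
    """Quantize with the source codebook, decode with the mapping codebook."""
    return mapping_cb[quantize(src_frames, src_cb)]
```

In this sketch, conversion is a table lookup: a source frame is replaced by the mapping code vector that shares the index of its nearest source code vector. Feature extraction, time alignment, and prosody are outside its scope.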