Phonological vocoding using artificial neural networks

Authors

Milos Cernak, Blaise Potard, Philip N Garner

Publication date

2015/4/19

Conference

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Pages

4844-4848

Publisher

IEEE

Description

We investigate a vocoder based on artificial neural networks using a phonological speech representation. Speech decomposition is based on phonological encoders, realised as neural network classifiers, that are trained for a particular language. The speech reconstruction process uses a Deep Neural Network (DNN) to map phonological feature posteriors to speech parameters (line spectra and glottal signal parameters), followed by LPC resynthesis. This DNN is trained on a target voice without transcriptions, in a semi-supervised manner. Both encoder and decoder are based on neural networks, and thus the vocoding is achieved using a simple, fast forward pass. An experiment with French vocoding and a target male voice trained on a 21-hour audiobook is presented. An application of the phonological vocoder to low bit rate speech coding is shown, where transmitted phonological posteriors …

Total citations

Cited by: 27

[Chart: citations per year, 2015 to 2022]
