On the (un)importance of the contextual factors in HMM-based speech synthesis and coding

A multi-level representation of f0 using the continuous wavelet transform and the discrete cosine transform

MS Ribeiro, RAJ Clark - 2015 IEEE International Conference …, 2015 - ieeexplore.ieee.org

We propose a representation of f0 using the Continuous Wavelet Transform (CWT) and the
Discrete Cosine Transform (DCT). The CWT decomposes the signal into various scales of …

Cited by 40 · Related articles · All 8 versions

Phonological vocoding using artificial neural networks

M Cernak, B Potard, PN Garner - 2015 IEEE International …, 2015 - ieeexplore.ieee.org

We investigate a vocoder based on artificial neural networks using a phonological speech
representation. Speech decomposition is based on the phonological encoders, realised as …

Cited by 27 · Related articles · All 11 versions

Decision tree usage for incremental parametric speech synthesis

T Baumann - 2014 IEEE International Conference on Acoustics …, 2014 - ieeexplore.ieee.org

Human speakers plan and deliver their utterances incrementally, piece-by-piece, and it is
obvious that their choice regarding phonetic details (and the details' peculiarities) is rarely …

Cited by 25 · Related articles · All 11 versions

A perceptual investigation of wavelet-based decomposition of f0 for text-to-speech synthesis

MS Ribeiro, J Yamagishi… - INTERSPEECH 2015 16th …, 2015 - research.ed.ac.uk

The Continuous Wavelet Transform (CWT) has been recently proposed to model f0
in the context of speech synthesis. It was shown that systems using signal decomposition …

Cited by 17 · Related articles · All 8 versions

Preliminary work on speaker adaptation for DNN-based speech synthesis

B Potard, P Motlicek, D Imseng - 2015 - infoscience.epfl.ch

We investigate speaker adaptation in the context of deep neural network (DNN) based
speech synthesis. More specifically, our current work focuses on the exploitation of auxiliary …

Cited by 18 · Related articles · All 6 versions

Partial representations improve the prosody of incremental speech synthesis

T Baumann - 2014 - edoc.sub.uni-hamburg.de

When humans speak, they do not plan their full utterance in all detail before beginning to
speak, nor do they speak piece-by-piece and ignoring their full message–instead humans …

Cited by 13 · Related articles · All 13 versions

Learning word vector representations based on acoustic counts

MS Ribeiro, O Watts, J Yamagishi - Interspeech 2017, 2017 - research.ed.ac.uk

This paper presents a simple count-based approach to learning word vector representations
by leveraging statistics of cooccurrences between text and speech. This type of …

Cited by 9 · Related articles · All 8 versions


Syllable-Level Representations of Suprasegmental Features for DNN-Based Text-to-Speech Synthesis

MS Ribeiro, O Watts, J Yamagishi - INTERSPEECH, 2016 - cstr.ed.ac.uk

A top-down hierarchical system based on deep neural networks is investigated for the
modeling of prosody in speech synthesis. Suprasegmental features are processed …

Cited by 10 · Related articles · All 8 versions

A small-footprint context-independent HMM-based synthesizer for Tamil

G Anushiya Rachel, V Sherlin Solomi… - International Journal of …, 2015 - Springer

A text-to-speech synthesis system produces intelligible and natural speech corresponding to
any given text. Two main attributes of a synthesizer are the quality of speech produced and …

Cited by 9 · Related articles · All 4 versions

Incremental syllable-context phonetic vocoding

M Cernak, PN Garner, A Lazaridis… - … /ACM Transactions on …, 2015 - ieeexplore.ieee.org

Current very low bit rate speech coders are, due to complexity limitations, designed to work
off-line. This paper investigates incremental speech coding that operates real-time and …

Cited by 11 · Related articles · All 7 versions