WORLD: a vocoder-based high-quality speech synthesis system for real-time applications

M Morise, F Yokomori, K Ozawa - IEICE TRANSACTIONS on …, 2016 - search.ieice.org
A vocoder-based speech synthesis system, named WORLD, was developed in an effort to
improve the sound quality of real-time applications using speech. Speech analysis …

D4C, a band-aperiodicity estimator for high-quality speech synthesis

M Morise - Speech Communication, 2016 - Elsevier
An algorithm is proposed for estimating the band aperiodicity of speech signals, where
“aperiodicity” is defined as the power ratio between the speech signal and the aperiodic …

CheapTrick, a spectral envelope estimator for high-quality speech synthesis

M Morise - Speech Communication, 2015 - Elsevier
A spectral envelope estimation algorithm is presented to achieve high-quality speech
synthesis. The concept of the algorithm is to obtain an accurate and temporally stable …

Harvest: A High-Performance Fundamental Frequency Estimator from Speech Signals

M Morise - INTERSPEECH, 2017 - isca-archive.org
A fundamental frequency (F0) estimator named Harvest is described. The unique points of
Harvest are that it can obtain a reliable F0 contour and reduce the error that the voiced …

Statistical singing voice conversion with direct waveform modification based on the spectrum differential

K Kobayashi, T Toda, G Neubig, S Sakti… - … Annual Conference of …, 2014 - isca-archive.org
This paper presents a novel statistical singing voice conversion (SVC) technique with direct
waveform modification based on the spectrum differential that can convert voice timbre of a …

Error evaluation of an F0-adaptive spectral envelope estimator in robustness against the additive noise and F0 error

M Morise - IEICE Transactions on Information and Systems, 2015 - search.ieice.org
This paper describes an evaluation of a temporally stable spectral envelope estimator
proposed in our past research. The past research demonstrated that the proposed algorithm …

Singing information processing

M Goto - 2014 12th International Conference on Signal …, 2014 - ieeexplore.ieee.org
This paper introduces singing information processing, which is defined as music information
processing for singing voices. As many people listen to music with a focus on singing …

Voice timbre control based on perceived age in singing voice conversion

K Kobayashi, T Toda, H Doi, T Nakano… - … on Information and …, 2014 - search.ieice.org
The perceived age of a singing voice is the age of the singer as perceived by the listener,
and is one of the notable characteristics that determines perceptions of a song. In this paper …

Human-in-the-loop speech-design system and its evaluation

D Kondo, M Morise - 2019 Asia-Pacific Signal and Information …, 2019 - ieeexplore.ieee.org
We propose a human-in-the-loop (HITL) speech-design system with an interface. General
text-to-speech (TTS) systems generate the speech waveform from the input text without the need …

Implementation of sequential real-time waveform generator for high-quality vocoder

M Morise - 2020 Asia-Pacific Signal and Information Processing …, 2020 - ieeexplore.ieee.org
We describe an implementation of real-time waveform generation from vocoded speech
parameters. High-quality vocoders such as STRAIGHT and WORLD have been used for …