Conventional and contemporary approaches used in text to speech synthesis: A review
N Kaur, P Singh - Artificial Intelligence Review, 2023 - Springer
Nowadays speech synthesis or text to speech (TTS), an ability of system to produce human
like natural sounding voice from the written text, is gaining popularity in the field of speech …
A review of deep learning based speech synthesis
Speech synthesis, also known as text-to-speech (TTS), has attracted increasingly more
attention. Recent advances on speech synthesis are overwhelmingly contributed by deep …
A survey on neural speech synthesis
Text to speech (TTS), or speech synthesis, which aims to synthesize intelligible and natural
speech given text, is a hot research topic in speech, language, and machine learning …
Fully-hierarchical fine-grained prosody modeling for interpretable speech synthesis
This paper proposes a hierarchical, fine-grained and interpretable latent variable model for
prosody based on the Tacotron 2 text-to-speech model. It achieves multi-resolution …
PromptStyle: Controllable style transfer for text-to-speech with natural language descriptions
Style transfer TTS has shown impressive performance in recent years. However, style
control is often restricted to systems built on expressive speech recordings with discrete style …
MsEmoTTS: Multi-scale emotion transfer, prediction, and control for emotional speech synthesis
Expressive synthetic speech is essential for many human-computer interaction and audio
broadcast scenarios, and thus synthesizing expressive speech has attracted much attention …
Controllable emotion transfer for end-to-end speech synthesis
Emotion embedding space learned from references is a straight-forward approach for
emotion transfer in encoder-decoder structured emotional text to speech (TTS) systems …
Emotion intensity and its control for emotional voice conversion
Emotional voice conversion (EVC) seeks to convert the emotional state of an utterance while
preserving the linguistic content and speaker identity. In EVC, emotions are usually treated …
Speech synthesis with mixed emotions
Emotional speech synthesis aims to synthesize human voices with various emotional effects.
The current studies are mostly focused on imitating an averaged style belonging to a specific …
Emotional speech synthesis with rich and granularized control
This paper proposes an effective emotion control method for an end-to-end text-to-speech
(TTS) system. To flexibly control the distinct characteristic of a target emotion category, it is …