Controllable Accented Text-to-Speech Synthesis With Fine and Coarse-Grained Intensity Rendering

R Liu, B Sisman, G Gao, H Li - IEEE/ACM Transactions on …, 2024 - ieeexplore.ieee.org
Accented text-to-speech (TTS) synthesis seeks to generate speech with an accent (L2) as a
variant of the standard version (L1), which is challenging as L2 is different from L1 in terms …

Controllable accented text-to-speech synthesis

R Liu, B Sisman, G Gao, H Li - arXiv preprint arXiv:2209.10804, 2022 - arxiv.org
Accented text-to-speech (TTS) synthesis seeks to generate speech with an accent (L2) as a
variant of the standard version (L1). Accented TTS synthesis is challenging as L2 is different …

Explicit intensity control for accented text-to-speech

R Liu, H Zuo, D Hu, G Gao, H Li - arXiv preprint arXiv:2210.15364, 2022 - arxiv.org
Accented text-to-speech (TTS) synthesis seeks to generate speech with an accent (L2) as a
variant of the standard version (L1). How to control the intensity of accent in the process of …

Accented text-to-speech synthesis with limited data

X Zhou, M Zhang, Y Zhou, Z Wu… - IEEE/ACM Transactions …, 2024 - ieeexplore.ieee.org
This paper presents an accented text-to-speech (TTS) synthesis framework with limited
training data. We study two aspects concerning accent rendering: phonetic (phoneme …

Accented text-to-speech synthesis with a conditional variational autoencoder

J Melechovsky, A Mehrish, B Sisman… - arXiv preprint arXiv …, 2022 - arxiv.org
Accent plays a significant role in speech communication, influencing understanding
capabilities and also conveying a person's identity. This paper introduces a novel and …

Grad-StyleSpeech: Any-speaker adaptive text-to-speech synthesis with diffusion models

M Kang, D Min, SJ Hwang - ICASSP 2023-2023 IEEE …, 2023 - ieeexplore.ieee.org
There has been significant progress in Text-To-Speech (TTS) synthesis technology in
recent years, thanks to the advancement in neural generative modeling. However, existing …

Investigation of Japanese PnG BERT language model in text-to-speech synthesis for pitch accent language

Y Yasuda, T Toda - IEEE Journal of Selected Topics in Signal …, 2022 - ieeexplore.ieee.org
End-to-end text-to-speech synthesis (TTS) can generate highly natural synthetic speech
from raw text. However, rendering the correct pitch accents is still a challenging problem for …

Learning accent representation with multi-level VAE towards controllable speech synthesis

J Melechovsky, A Mehrish… - 2022 IEEE Spoken …, 2023 - ieeexplore.ieee.org
Accent is a crucial aspect of speech that helps define one's identity. We note that the state-of-
the-art Text-to-Speech (TTS) systems can achieve high-quality generated voice, but still lack …

RAD-MMM: Multilingual multiaccented multispeaker text to speech

R Badlani, R Valle, KJ Shih, JF Santos… - Proc …, 2023 - isca-archive.org
We create a multilingual speech synthesis system that can generate speech with a native
accent in any seen language while retaining the characteristics of an individual's voice. It is …

StyleTTS: A style-based generative model for natural and diverse text-to-speech synthesis

YA Li, C Han, N Mesgarani - arXiv preprint arXiv:2205.15439, 2022 - arxiv.org
Text-to-Speech (TTS) has recently seen great progress in synthesizing high-quality speech
owing to the rapid development of parallel TTS systems, but producing speech with …