Exploiting morphological and phonological features to improve prosodic phrasing for Mongolian speech synthesis
Prosodic phrasing is an important factor that affects naturalness and intelligibility in text-to-
speech synthesis. Studies show that deep learning techniques improve prosodic phrasing …
Modeling prosodic phrasing with multi-task learning in Tacotron-based TTS
Tacotron-based end-to-end speech synthesis has shown remarkable voice quality.
However, the rendering of prosody in the synthesized speech remains to be improved …
Deep learning for prominence detection in children's read speech
The detection of perceived prominence in speech has attracted approaches ranging from
the design of knowledge-based linguistic and acoustic features to the automatic feature …
Automatic Prosody Annotation with Pre-Trained Text-Speech Model
Prosodic boundary plays an important role in text-to-speech synthesis (TTS) in terms of
naturalness and readability. However, the acquisition of prosodic boundary labels relies on …
Multi-Modal Automatic Prosody Annotation with Contrastive Pretraining of Speech-Silence and Word-Punctuation
J Zhong, Y Li, H Huang, K Richmond, J Liu… - Proc. Interspeech …, 2024 - isca-archive.org
In expressive and controllable Text-to-Speech (TTS), explicit prosodic features significantly
improve the naturalness and controllability of synthesised speech. However, manual …
Pitchtron: Towards audiobook generation from ordinary people's voices
In this paper, we explore prosody transfer for audiobook generation under rather realistic
condition where training DB is plain audio mostly from multiple ordinary people and …
Incorporating prosodic events in text-to-speech synthesis
R Sloan, A Adigwe, S Mohandoss… - Proceedings of the …, 2022 - isca-archive.org
While producing accurate prosody can significantly improve the naturalness and
comprehensibility of synthesized speech, many Text-to-Speech (TTS) systems still do not …
Multi-Modal Automatic Prosody Annotation with Contrastive Pretraining of SSWP
J Zhong, Y Li, H Huang, K Richmond, J Liu… - arXiv preprint arXiv …, 2023 - arxiv.org
In expressive and controllable Text-to-Speech (TTS), explicit prosodic features significantly
improve the naturalness and controllability of synthesised speech. However, manual …
Korean Prosody Phrase Boundary Prediction Model for Speech Synthesis Service in Smart Healthcare
Speech processing technology has great potential in the medical field to provide beneficial
solutions for both patients and doctors. Speech interfaces, represented by speech synthesis …
The effect of different information sources on prosodic boundary perception
This study aims to quantify the effect of several information sources: acoustic, higher-level
linguistic, and knowledge of the prosodic system of the language, on the perception of …