MsEmoTTS: Multi-scale emotion transfer, prediction, and control for emotional speech synthesis
Expressive synthetic speech is essential for many human-computer interaction and audio
broadcast scenarios, and thus synthesizing expressive speech has attracted much attention …
Deep learning-based expressive speech synthesis: a systematic review of approaches, challenges, and resources
H Barakat, O Turk, C Demiroglu - EURASIP Journal on Audio, Speech, and …, 2024 - Springer
Speech synthesis has made significant strides thanks to the transition from machine learning
to deep learning models. Contemporary text-to-speech (TTS) models possess the capability …
MSStyleTTS: Multi-scale style modeling with hierarchical context information for expressive speech synthesis
Expressive speech synthesis is crucial for many human-computer interaction scenarios,
such as audiobooks, podcasts, and voice assistants. Previous works focus on predicting the …
Dynamic Invariant‐Specific Representation Fusion Network for Multimodal Sentiment Analysis
J He, H Yang, C Zhang, H Chen… - Computational …, 2022 - Wiley Online Library
Multimodal sentiment analysis (MSA) aims to infer emotions from linguistic, auditory, and
visual sequences. Multimodal information representation method and fusion technology are …
Towards expressive speaking style modelling with hierarchical context information for mandarin speech synthesis
Previous works on expressive speech synthesis mainly focus on the current sentence. The
context in adjacent sentences is neglected, resulting in inflexible speaking style for the same …
Prosody modelling with pre-trained cross-utterance representations for improved speech synthesis
When humans speak multiple utterances in a continuous manner, the prosodic features
generated in each utterance are related to those in its neighbouring utterances. Such cross …
MSM-VC: high-fidelity source style transfer for non-parallel voice conversion by multi-scale style modeling
In addition to conveying the linguistic content from source speech to converted speech,
maintaining the speaking style of source speech also plays an important role in the voice …
Unsupervised multi-scale expressive speaking style modeling with hierarchical context information for audiobook speech synthesis
Naturalness and expressiveness are crucial for audiobook speech synthesis, but are currently
limited by the averaged global-scale speaking style representation. In this paper, we …
Context-aware coherent speaking style prediction with hierarchical transformers for audiobook speech synthesis
Recent advances in text-to-speech have significantly improved the expressiveness of
synthesized speech. However, it is still challenging to generate speech with contextually …
[PDF] Integrating Discrete Word-Level Style Variations into Non-Autoregressive Acoustic Models for Speech Synthesis
This paper presents a method of integrating word-level style variations (WSVs) into non-
autoregressive acoustic models for speech synthesis. WSVs are discrete latent …