METTS: Multilingual emotional text-to-speech by cross-speaker and cross-lingual emotion transfer
Previous multilingual text-to-speech (TTS) approaches have considered leveraging
monolingual speaker data to enable cross-lingual speech synthesis. However, such data …
Cross-lingual prosody transfer for expressive machine dubbing
Prosody transfer is well-studied in the context of expressive speech synthesis. Cross-lingual
prosody transfer, however, is challenging and has been under-explored to date. In this …
Expressive machine dubbing through phrase-level cross-lingual prosody transfer
J Swiatkowski, D Wang, M Babianski, G Coccia… - arXiv preprint arXiv …, 2023 - arxiv.org
Speech generation for machine dubbing adds complexity to conventional Text-To-Speech
solutions as the generated output is required to match the expressiveness, emotion and …
Variational Inference Applications in Deep Learning
J Świątkowski - repozytorium.uw.edu.pl
This PhD thesis explores applications of variational inference (Peterson, 1987; Hinton & Van
Camp, 1993a) techniques in the domain of deep learning (Goodfellow et al., 2016) …