All versions - Academic resource search

Comparing normalizing flows and diffusion models for prosody and acoustic modelling in text-to-speech

G Zhang, T Merritt, MS Ribeiro, B Tura-Vecino… - arXiv preprint arXiv …, 2023 - arxiv.org

Neural text-to-speech systems are often optimized on L1/L2 losses, which make strong
assumptions about the distributions of the target data space. Aiming to improve those …

Cited by: 3 Related articles

[HTML] amazon.science

[HTML] Comparing normalizing flows and diffusion models for prosody and acoustic modelling in text-to-speech

G Zhang, T Merritt, S Ribeiro, BT Vecino… - 2023 - amazon.science

Neural text-to-speech systems are often optimized on L1/L2 losses, which make strong
assumptions about the distributions of the target data space. Aiming to improve those …

[PDF] researchgate.net

[PDF] Comparing normalizing flows and diffusion models for prosody and acoustic modelling in text-to-speech

G Zhang, T Merritt, MS Ribeiro, B Tura-Vecino… - researchgate.net

Neural text-to-speech systems are often optimized on L1/L2 losses, which make strong
assumptions about the distributions of the target data space. Aiming to improve those …

Comparing normalizing flows and diffusion models for prosody and acoustic modelling in text-to-speech

G Zhang, T Merritt, MS Ribeiro, B Tura-Vecino… - arXiv e …, 2023 - ui.adsabs.harvard.edu

Neural text-to-speech systems are often optimized on L1/L2 losses, which make strong
assumptions about the distributions of the target data space. Aiming to improve those …

[PDF] isca-archive.org

[PDF] Comparing normalizing flows and diffusion models for prosody and acoustic modelling in text-to-speech

G Zhang, T Merritt, MS Ribeiro, B Tura-Vecino… - isca-archive.org

Neural text-to-speech systems are often optimized on L1/L2 losses, which make strong
assumptions about the distributions of the target data space. Aiming to improve those …