Regotron: Regularizing the Tacotron2 architecture via monotonic alignment loss

2 results (0.02 seconds)

Citing articles:

Utilizing Neural Transducers for Two-Stage Text-to-Speech via Semantic Token Prediction

M Kim, M Jeong, BJ Choi, S Kim, JY Lee… - arXiv preprint arXiv …, 2024 - arxiv.org

We propose a novel text-to-speech (TTS) framework centered around a neural transducer.
Our approach divides the whole TTS pipeline into semantic-level sequence-to-sequence …

Cited by: 1. All 2 versions.

Small-E: Small Language Model with Linear Attention for Efficient Speech Synthesis

T Lemerle, N Obin, A Roebel - arXiv preprint arXiv:2406.04467, 2024 - arxiv.org

Recent advancements in text-to-speech (TTS) powered by language models have
showcased remarkable capabilities in achieving naturalness and zero-shot voice cloning …