Incremental text-to-speech synthesis with prefix-to-prefix framework

A survey on neural speech synthesis

X Tan, T Qin, F Soong, TY Liu - arXiv preprint arXiv:2106.15561, 2021 - arxiv.org

Text to speech (TTS), or speech synthesis, which aims to synthesize intelligible and natural
speech given text, is a hot research topic in speech, language, and machine learning …

Cited by: 366 (2 versions)

Glow-TTS: A generative flow for text-to-speech via monotonic alignment search

J Kim, S Kim, J Kong, S Yoon - Advances in Neural …, 2020 - proceedings.neurips.cc

Recently, text-to-speech (TTS) models such as FastSpeech and ParaNet have been
proposed to generate mel-spectrograms from text in parallel. Despite the advantage, the …

Cited by: 452 (5 versions)

Review of end-to-end speech synthesis technology based on deep learning

Z Mu, X Yang, Y Dong - arXiv preprint arXiv:2104.09995, 2021 - arxiv.org

As an indispensable part of modern human-computer interaction system, speech synthesis
technology helps users get the output of intelligent machine more easily and intuitively, thus …

Cited by: 34 (2 versions)

Speech-T: Transducer for text to speech and beyond

J Chen, X Tan, Y Leng, J Xu, G Wen… - Advances in Neural …, 2021 - proceedings.neurips.cc

Neural Transducer (e.g., RNN-T) has been widely used in automatic speech
recognition (ASR) due to its capabilities of efficiently modeling monotonic alignments …

Cited by: 16 (5 versions)

TDASS: Target domain adaptation speech synthesis framework for multi-speaker low-resource TTS

X Zhang, J Wang, N Cheng… - 2022 International Joint …, 2022 - ieeexplore.ieee.org

Recently, synthesizing personalized speech by text-to-speech (TTS) application is highly
demanded. But the previous TTS models require a mass of target speaker speeches for …

Cited by: 15 (4 versions)

Incremental text-to-speech synthesis using pseudo lookahead with large pretrained language model

T Saeki, S Takamichi… - IEEE Signal Processing …, 2021 - ieeexplore.ieee.org

This letter presents an incremental text-to-speech (TTS) method that performs synthesis in
small linguistic units while maintaining the naturalness of output speech. Incremental TTS is …

Cited by: 19 (7 versions)

What the future brings: Investigating the impact of lookahead for incremental neural TTS

B Stephenson, L Besacier, L Girin, T Hueber - arXiv preprint arXiv …, 2020 - arxiv.org

In incremental text to speech synthesis (iTTS), the synthesizer produces an audio output
before it has access to the entire input sentence. In this paper, we study the behavior of a …

Cited by: 20 (13 versions)

A machine speech chain approach for dynamically adaptive Lombard TTS in static and dynamic noise environments

S Novitasari, S Sakti… - IEEE/ACM Transactions on …, 2022 - ieeexplore.ieee.org

Recent end-to-end text-to-speech synthesis (TTS) systems have successfully synthesized
high-quality speech. However, TTS speech intelligibility degrades in noisy environments …

Cited by: 6 (6 versions)

Speak While You Think: Streaming Speech Synthesis During Text Generation

A Dekel, S Shechtman, R Fernandez… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org

Large Language Models (LLMs) demonstrate impressive capabilities, yet interaction with
these models is mostly facilitated through text. Using Text-To-Speech to synthesize LLM …

Cited by: 2 (4 versions)

Incremental text to speech for neural sequence-to-sequence models using reinforcement learning

DSR Mohan, R Lenain, L Foglianti, TH Teh… - arXiv preprint arXiv …, 2020 - arxiv.org

Modern approaches to text to speech require the entire input character sequence to be
processed before any audio is synthesised. This latency limits the suitability of such models …

Cited by: 16 (8 versions)