Fast WaveNet generation algorithm

X Tan, T Qin, F Soong, TY Liu - arXiv preprint arXiv:2106.15561, 2021 - arxiv.org

Text to speech (TTS), or speech synthesis, which aims to synthesize intelligible and natural
speech given text, is a hot research topic in speech, language, and machine learning …

Cited by 392 - Related articles - All 2 versions

It's raw! audio generation with state-space models

K Goel, A Gu, C Donahue, C Ré - … Conference on Machine …, 2022 - proceedings.mlr.press

Developing architectures suitable for modeling raw audio is a challenging problem due to
the high sampling rates of audio waveforms. Standard sequence modeling approaches like …

Cited by 165 - Related articles - All 4 versions

GANSynth: Adversarial neural audio synthesis

J Engel, KK Agrawal, S Chen, I Gulrajani… - arXiv preprint arXiv …, 2019 - arxiv.org

Efficient audio synthesis is an inherently difficult machine learning task, as human
perception is sensitive to both global structure and fine-scale waveform coherence …

Cited by 535 - Related articles - All 7 versions

Parallel WaveNet: Fast high-fidelity speech synthesis

A Oord, Y Li, I Babuschkin… - International …, 2018 - proceedings.mlr.press

The recently-developed WaveNet architecture is the current state of the art in realistic
speech synthesis, consistently rated as more natural sounding for many different languages …

Cited by 979 - Related articles - All 7 versions

Deep Voice: Real-time neural text-to-speech

SÖ Arık, M Chrzanowski, A Coates… - International …, 2017 - proceedings.mlr.press

We present Deep Voice, a production-quality text-to-speech system constructed
entirely from deep neural networks. Deep Voice lays the groundwork for truly end-to-end …

Cited by 818 - Related articles - All 5 versions

MidiNet: A convolutional generative adversarial network for symbolic-domain music generation

LC Yang, SY Chou, YH Yang - arXiv preprint arXiv:1703.10847, 2017 - arxiv.org

Most existing neural network models for music generation use recurrent neural networks.
However, the recent WaveNet model proposed by DeepMind shows that convolutional …

Cited by 649 - Related articles - All 4 versions

GluNet: A deep learning framework for accurate glucose forecasting

K Li, C Liu, T Zhu, P Herrero… - IEEE journal of …, 2019 - ieeexplore.ieee.org

For people with Type 1 diabetes (T1D), forecasting of blood glucose (BG) can be used to
effectively avoid hyperglycemia, hypoglycemia and associated complications. The latest …

Cited by 157 - Related articles - All 6 versions

WaveFlow: A compact flow-based model for raw audio

W Ping, K Peng, K Zhao, Z Song - … Conference on Machine …, 2020 - proceedings.mlr.press

In this work, we propose WaveFlow, a small-footprint generative flow for raw audio, which is
directly trained with maximum likelihood. It handles the long-range structure of 1-D …

Cited by 141 - Related articles - All 6 versions

Neural source-filter waveform models for statistical parametric speech synthesis

X Wang, S Takaki, J Yamagishi - IEEE/ACM Transactions on …, 2019 - ieeexplore.ieee.org

Neural waveform models have demonstrated better performance than conventional
vocoders for statistical parametric speech synthesis. One of the best models, called …

Cited by 156 - Related articles - All 6 versions

[PDF] A Deep Learning Algorithm for Personalized Blood Glucose Prediction.

T Zhu, K Li, P Herrero, J Chen, P Georgiou - KDH@ IJCAI, 2018 - ceur-ws.org

A convolutional neural network (CNN) model is presented to forecast the future glucose
levels of the patients with type 1 diabetes. The model is a modified version of a recently …

Cited by 135 - Related articles - All 2 versions