A survey on neural speech synthesis

X Tan, T Qin, F Soong, TY Liu - arXiv preprint arXiv:2106.15561, 2021 - arxiv.org
Text to speech (TTS), or speech synthesis, which aims to synthesize intelligible and natural
speech given text, is a hot research topic in speech, language, and machine learning …

It's Raw! Audio generation with state-space models

K Goel, A Gu, C Donahue, C Ré - … Conference on Machine …, 2022 - proceedings.mlr.press
Developing architectures suitable for modeling raw audio is a challenging problem due to
the high sampling rates of audio waveforms. Standard sequence modeling approaches like …

GANSynth: Adversarial neural audio synthesis

J Engel, KK Agrawal, S Chen, I Gulrajani… - arXiv preprint arXiv …, 2019 - arxiv.org
Efficient audio synthesis is an inherently difficult machine learning task, as human
perception is sensitive to both global structure and fine-scale waveform coherence …

Parallel WaveNet: Fast high-fidelity speech synthesis

A Oord, Y Li, I Babuschkin… - International …, 2018 - proceedings.mlr.press
The recently-developed WaveNet architecture is the current state of the art in realistic
speech synthesis, consistently rated as more natural sounding for many different languages …

Deep Voice: Real-time neural text-to-speech

SÖ Arık, M Chrzanowski, A Coates… - International …, 2017 - proceedings.mlr.press
We present Deep Voice, a production-quality text-to-speech system constructed
entirely from deep neural networks. Deep Voice lays the groundwork for truly end-to-end …

MidiNet: A convolutional generative adversarial network for symbolic-domain music generation

LC Yang, SY Chou, YH Yang - arXiv preprint arXiv:1703.10847, 2017 - arxiv.org
Most existing neural network models for music generation use recurrent neural networks.
However, the recent WaveNet model proposed by DeepMind shows that convolutional …

GluNet: A deep learning framework for accurate glucose forecasting

K Li, C Liu, T Zhu, P Herrero… - IEEE journal of …, 2019 - ieeexplore.ieee.org
For people with Type 1 diabetes (T1D), forecasting of blood glucose (BG) can be used to
effectively avoid hyperglycemia, hypoglycemia and associated complications. The latest …

WaveFlow: A compact flow-based model for raw audio

W Ping, K Peng, K Zhao, Z Song - … Conference on Machine …, 2020 - proceedings.mlr.press
In this work, we propose WaveFlow, a small-footprint generative flow for raw audio, which is
directly trained with maximum likelihood. It handles the long-range structure of 1-D …

Neural source-filter waveform models for statistical parametric speech synthesis

X Wang, S Takaki, J Yamagishi - IEEE/ACM Transactions on …, 2019 - ieeexplore.ieee.org
Neural waveform models have demonstrated better performance than conventional
vocoders for statistical parametric speech synthesis. One of the best models, called …

A Deep Learning Algorithm for Personalized Blood Glucose Prediction

T Zhu, K Li, P Herrero, J Chen, P Georgiou - KDH@ IJCAI, 2018 - ceur-ws.org
A convolutional neural network (CNN) model is presented to forecast the future glucose
levels of the patients with type 1 diabetes. The model is a modified version of a recently …