A survey on neural speech synthesis
Text to speech (TTS), or speech synthesis, which aims to synthesize intelligible and natural
speech given text, is a hot research topic in speech, language, and machine learning …
It's raw! audio generation with state-space models
Developing architectures suitable for modeling raw audio is a challenging problem due to
the high sampling rates of audio waveforms. Standard sequence modeling approaches like …
GANSynth: Adversarial neural audio synthesis
Efficient audio synthesis is an inherently difficult machine learning task, as human
perception is sensitive to both global structure and fine-scale waveform coherence …
Parallel WaveNet: Fast high-fidelity speech synthesis
The recently-developed WaveNet architecture is the current state of the art in realistic
speech synthesis, consistently rated as more natural sounding for many different languages …
Deep Voice: Real-time neural text-to-speech
We present Deep Voice, a production-quality text-to-speech system constructed
entirely from deep neural networks. Deep Voice lays the groundwork for truly end-to-end …
MidiNet: A convolutional generative adversarial network for symbolic-domain music generation
Most existing neural network models for music generation use recurrent neural networks.
However, the recent WaveNet model proposed by DeepMind shows that convolutional …
GluNet: A deep learning framework for accurate glucose forecasting
For people with Type 1 diabetes (T1D), forecasting of blood glucose (BG) can be used to
effectively avoid hyperglycemia, hypoglycemia and associated complications. The latest …
WaveFlow: A compact flow-based model for raw audio
In this work, we propose WaveFlow, a small-footprint generative flow for raw audio, which is
directly trained with maximum likelihood. It handles the long-range structure of 1-D …
Neural source-filter waveform models for statistical parametric speech synthesis
X Wang, S Takaki, J Yamagishi - IEEE/ACM Transactions on …, 2019 - ieeexplore.ieee.org
Neural waveform models have demonstrated better performance than conventional
vocoders for statistical parametric speech synthesis. One of the best models, called …
A Deep Learning Algorithm for Personalized Blood Glucose Prediction
A convolutional neural network (CNN) model is presented to forecast the future glucose
levels of patients with type 1 diabetes. The model is a modified version of a recently …