Deep learning: Applications, architectures, models, tools, and frameworks: A comprehensive survey
M Gheisari, F Ebrahimzadeh, M Rahimi… - CAAI Transactions …, 2023 - Wiley Online Library
Deep Learning (DL) is a subfield of machine learning that significantly impacts extracting
new knowledge. By using DL, the extraction of advanced data representations and …
Deepfakes generation and detection: State-of-the-art, open challenges, countermeasures, and way forward
Easy access to audio-visual content on social media, combined with the availability of
modern tools such as TensorFlow or Keras, and open-source trained models, along with …
DiffWave: A versatile diffusion model for audio synthesis
In this work, we propose DiffWave, a versatile diffusion probabilistic model for conditional
and unconditional waveform generation. The model is non-autoregressive, and converts the …
WaveGrad: Estimating gradients for waveform generation
This paper introduces WaveGrad, a conditional model for waveform generation which
estimates gradients of the data density. The model is built on prior work on score matching …
Jukebox: A generative model for music
P Dhariwal, H Jun, C Payne, JW Kim… - arXiv preprint arXiv …, 2020 - assets.pubpub.org
We introduce Jukebox, a model that generates music with singing in the raw audio domain.
We tackle the long context of raw audio using a multiscale VQ-VAE to compress it to discrete …
Synthetic Data--what, why and how?
This explainer document aims to provide an overview of the current state of the rapidly
expanding work on synthetic data technologies, with a particular focus on privacy. The …
LibriTTS: A corpus derived from LibriSpeech for text-to-speech
This paper introduces a new speech corpus called "LibriTTS" designed for text-to-speech
use. It is derived from the original audio and text materials of the LibriSpeech corpus, which …
Neural speech synthesis with transformer network
Although end-to-end neural text-to-speech (TTS) methods (such as Tacotron2) are proposed
and achieve state-of-the-art performance, they still suffer from two problems: 1) low efficiency …
Natural TTS synthesis by conditioning WaveNet on mel spectrogram predictions
This paper describes Tacotron 2, a neural network architecture for speech synthesis directly
from text. The system is composed of a recurrent sequence-to-sequence feature prediction …
Transfer learning from speaker verification to multispeaker text-to-speech synthesis
We describe a neural network-based system for text-to-speech (TTS) synthesis that is able to
generate speech audio in the voice of many different speakers, including those unseen …