A review of differentiable digital signal processing for music and speech synthesis
The term “differentiable digital signal processing” describes a family of techniques in which
loss function gradients are backpropagated through digital signal processors, facilitating …
Investigations on speaker adaptation using a continuous vocoder within recurrent neural network based text-to-speech synthesis
This paper presents an investigation of speaker adaptation using a continuous vocoder for
parametric text-to-speech (TTS) synthesis. In purposes that demand low computational …
End-to-end LPCNet: A neural vocoder with fully-differentiable LPC estimation
Neural vocoders have recently demonstrated high quality speech synthesis, but typically
require a high computational complexity. LPCNet was proposed as a way to reduce the …
Unsupervised music source separation using differentiable parametric source models
K Schulze-Forster, G Richard, L Kelley… - … on Audio, Speech …, 2023 - ieeexplore.ieee.org
Supervised deep learning approaches to underdetermined audio source separation achieve
state-of-the-art performance but require a dataset of mixtures along with their corresponding …
Embodying rather than encoding: Towards developing a source-filter theory for undulation gait generation
Biological undulation enables legless creatures to move naturally, and robustly in various
environments. Consequently, many kinds of undulating robots have been developed …
Differentiable Time-Varying Linear Prediction in the Context of End-to-End Analysis-by-Synthesis
CY Yu, G Fazekas - arXiv preprint arXiv:2406.05128, 2024 - arxiv.org
Training the linear prediction (LP) operator end-to-end for audio synthesis in modern deep
learning frameworks is slow due to its recursive formulation. In addition, frame-wise …
Differentiable All-pole Filters for Time-varying Audio Systems
Infinite impulse response filters are an essential building block of many time-varying audio
systems, such as audio effects and synthesisers. However, their recursive structure impedes …
Investigations on speaker adaptation using a continuous vocoder within recurrent neural network based text-to-speech synthesis
M Ali Raheem, AR Mohammed Salah, C Tamás Gábor - 2023 - dlib.phenikaa-uni.edu.vn
This paper presents an investigation of speaker adaptation using a continuous vocoder for
parametric text-to-speech (TTS) synthesis. In purposes that demand low computational …
Informed audio source separation with deep learning in limited data settings
K Schulze-Forster - 2021 - theses.hal.science
Audio source separation is the task of estimating the individual signals of several sound
sources when only their mixture can be observed. State-of-the-art performance for musical …