A review of differentiable digital signal processing for music and speech synthesis

B Hayes, J Shier, G Fazekas, A McPherson… - Frontiers in Signal …, 2024 - frontiersin.org
The term “differentiable digital signal processing” describes a family of techniques in which
loss function gradients are backpropagated through digital signal processors, facilitating …

Investigations on speaker adaptation using a continuous vocoder within recurrent neural network based text-to-speech synthesis

AR Mandeel, MS Al-Radhi, TG Csapó - Multimedia Tools and Applications, 2023 - Springer
This paper presents an investigation of speaker adaptation using a continuous vocoder for
parametric text-to-speech (TTS) synthesis. In purposes that demand low computational …

End-to-end LPCNet: A neural vocoder with fully-differentiable LPC estimation

K Subramani, JM Valin, U Isik, P Smaragdis… - arXiv preprint arXiv …, 2022 - arxiv.org
Neural vocoders have recently demonstrated high quality speech synthesis, but typically
require a high computational complexity. LPCNet was proposed as a way to reduce the …

Unsupervised music source separation using differentiable parametric source models

K Schulze-Forster, G Richard, L Kelley… - … on Audio, Speech …, 2023 - ieeexplore.ieee.org
Supervised deep learning approaches to underdetermined audio source separation achieve
state-of-the-art performance but require a dataset of mixtures along with their corresponding …

Embodying rather than encoding: Towards developing a source-filter theory for undulation gait generation

L Li, S Ma, I Tokuda, Z Liu, Z Ma, Y Tian… - Biomimetic Intelligence and …, 2024 - Elsevier
Biological undulation enables legless creatures to move naturally, and robustly in various
environments. Consequently, many kinds of undulating robots have been developed …

Differentiable Time-Varying Linear Prediction in the Context of End-to-End Analysis-by-Synthesis

CY Yu, G Fazekas - arXiv preprint arXiv:2406.05128, 2024 - arxiv.org
Training the linear prediction (LP) operator end-to-end for audio synthesis in modern deep
learning frameworks is slow due to its recursive formulation. In addition, frame-wise …

Differentiable All-pole Filters for Time-varying Audio Systems

CY Yu, C Mitcheltree, A Carson, S Bilbao… - arXiv preprint arXiv …, 2024 - arxiv.org
Infinite impulse response filters are an essential building block of many time-varying audio
systems, such as audio effects and synthesisers. However, their recursive structure impedes …

Informed audio source separation with deep learning in limited data settings

K Schulze-Forster - 2021 - theses.hal.science
Audio source separation is the task of estimating the individual signals of several sound
sources when only their mixture can be observed. State-of-the-art performance for musical …