Speech enhancement and dereverberation with diffusion-based generative models

J Richter, S Welker, JM Lemercier… - … on Audio, Speech …, 2023 - ieeexplore.ieee.org
In this work, we build upon our previous publication and use diffusion-based generative
models for speech enhancement. We present a detailed overview of the diffusion process …

PFVAE: a planar flow-based variational auto-encoder prediction model for time series data

XB Jin, WT Gong, JL Kong, YT Bai, TL Su - Mathematics, 2022 - mdpi.com
Prediction based on time series has a wide range of applications. Due to the complex
nonlinear and random distribution of time series data, the performance of learning prediction …

Speech enhancement with score-based generative models in the complex STFT domain

S Welker, J Richter, T Gerkmann - arXiv preprint arXiv:2203.17004, 2022 - arxiv.org
Score-based generative models (SGMs) have recently shown impressive results for difficult
generative tasks such as the unconditional and conditional generation of natural images …

On learning spectral masking for single channel speech enhancement using feedforward and recurrent neural networks

N Saleem, MI Khattak, M Al-Hasan, AB Qazi - IEEE Access, 2020 - ieeexplore.ieee.org
Human speech in real-world environments is typically degraded by the background noise.
They have a negative impact on perceptual speech quality and intelligibility which causes …

A flow-based neural network for time domain speech enhancement

M Strauss, B Edler - ICASSP 2021-2021 IEEE International …, 2021 - ieeexplore.ieee.org
Speech enhancement involves the distinction of a target speech signal from an intrusive
background. Although generative approaches using Variational Autoencoders or Generative …

SRTNet: Time domain speech enhancement via stochastic refinement

Z Qiu, M Fu, Y Yu, LL Yin, F Sun… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org
Diffusion model, as a new generative model which is very popular in image generation and
audio synthesis, is rarely used in speech enhancement. In this paper, we use the diffusion …

Variance-preserving-based interpolation diffusion models for speech enhancement

Z Guo, J Du, CH Lee, Y Gao, W Zhang - arXiv preprint arXiv:2306.08527, 2023 - arxiv.org
The goal of this study is to implement diffusion models for speech enhancement (SE). The
first step is to emphasize the theoretical foundation of variance-preserving (VP)-based …

Flow-based independent vector analysis for blind source separation

AA Nugraha, K Sekiguchi, M Fontaine… - IEEE Signal …, 2020 - ieeexplore.ieee.org
This letter describes a time-varying extension of independent vector analysis (IVA) based on
the normalizing flow (NF), called NF-IVA, for determined blind source separation of …

Noise-aware speech enhancement using diffusion probabilistic model

Y Hu, C Chen, R Li, Q Zhu, ES Chng - arXiv preprint arXiv:2307.08029, 2023 - arxiv.org
With recent advances of diffusion model, generative speech enhancement (SE) has
attracted a surge of research interest due to its great potential for unseen testing noises …

SE-Bridge: Speech enhancement with consistent Brownian bridge

Z Qiu, M Fu, F Sun, G Altenbek, H Huang - arXiv preprint arXiv:2305.13796, 2023 - arxiv.org
We propose SE-Bridge, a novel method for speech enhancement (SE). After recently
applying the diffusion models to speech enhancement, we can achieve speech …