Speech enhancement and dereverberation with diffusion-based generative models
In this work, we build upon our previous publication and use diffusion-based generative
models for speech enhancement. We present a detailed overview of the diffusion process …
Deep neural network techniques for monaural speech enhancement and separation: state of the art analysis
P Ochieng - Artificial Intelligence Review, 2023 - Springer
Deep neural networks (DNN) techniques have become pervasive in domains such as
natural language processing and computer vision. They have achieved great success in …
Deep learning-based non-intrusive multi-objective speech assessment model with cross-domain features
This study proposes a cross-domain multi-objective speech assessment model, called
MOSA-Net, which can simultaneously estimate the speech quality, intelligibility, and …
Self-supervised visual acoustic matching
A Somayazulu, C Chen… - Advances in Neural …, 2024 - proceedings.neurips.cc
Acoustic matching aims to re-synthesize an audio clip to sound as if it were recorded in a
target acoustic environment. Existing methods assume access to paired training data, where …
Unsupervised speech enhancement using dynamical variational autoencoders
Dynamical variational autoencoders (DVAEs) are a class of deep generative models with
latent variables, dedicated to model time series of high-dimensional data. DVAEs can be …
TEA-PSE 3.0: Tencent-Ethereal-Audio-Lab personalized speech enhancement system for ICASSP 2023 DNS-Challenge
This paper introduces the Unbeatable Team's submission to the ICASSP 2023 Deep Noise
Suppression (DNS) Challenge. We expand our previous work, TEA-PSE, to its upgraded …
Self-Supervised Speech Quality Estimation and Enhancement Using Only Clean Speech
Speech quality estimation has recently undergone a paradigm shift from human-hearing
expert designs to machine-learning models. However, current models rely mainly on …
Integrating uncertainty into neural network-based speech enhancement
H Fang, D Becker, S Wermter… - IEEE/ACM Transactions …, 2023 - ieeexplore.ieee.org
Supervised masking approaches in the time-frequency domain aim to employ deep neural
networks to estimate a multiplicative mask to extract clean speech. This leads to a single …
USDnet: Unsupervised Speech Dereverberation via Neural Forward Filtering
ZQ Wang - arXiv preprint arXiv:2402.00820, 2024 - arxiv.org
In reverberant conditions with a single speaker, each far-field microphone records a
reverberant version of the same speaker signal at a different location. In over-determined …
HD-DEMUCS: General speech restoration with heterogeneous decoders
This paper introduces an end-to-end neural speech restoration model, HD-DEMUCS,
demonstrating efficacy across multiple distortion environments. Unlike conventional …