Complex-valued neural networks: A comprehensive survey
CY Lee, H Hasegawa, S Gao - IEEE/CAA Journal of …, 2022 - ieeexplore.ieee.org
Complex-valued neural networks (CVNNs) have shown their excellent efficiency compared
to their real counter-parts in speech enhancement, image and signal processing …
Audio self-supervised learning: A survey
Similar to humans' cognitive ability to generalize knowledge and skills, self-supervised
learning (SSL) targets discovering general representations from large-scale data. This …
DCCRN: Deep complex convolution recurrent network for phase-aware speech enhancement
Speech enhancement has benefited from the success of deep learning in terms of
intelligibility and perceptual quality. Conventional time-frequency (TF) domain methods …
ICASSP 2023 deep noise suppression challenge
The ICASSP 2023 Deep Noise Suppression (DNS) Challenge marks the fifth edition of the
DNS challenge series. DNS challenges were organized from 2019 to 2023 to foster …
Speech enhancement and dereverberation with diffusion-based generative models
In this work, we build upon our previous publication and use diffusion-based generative
models for speech enhancement. We present a detailed overview of the diffusion process …
Learning complex spectral mapping with gated convolutional recurrent networks for monaural speech enhancement
Phase is important for perceptual quality of speech. However, it seems intractable to directly
estimate phase spectra through supervised learning due to their lack of spectrotemporal …
TSTNN: Two-stage transformer based neural network for speech enhancement in the time domain
In this paper, we propose a transformer-based architecture, called two-stage transformer
neural network (TSTNN) for end-to-end speech denoising in the time domain. The proposed …
MetricGAN: Generative adversarial networks based black-box metric scores optimization for speech enhancement
Adversarial loss in a conditional generative adversarial network (GAN) is not designed to
directly optimize evaluation metrics of a target task, and thus, may not always guide the …
Two heads are better than one: A two-stage complex spectral mapping approach for monaural speech enhancement
For challenging acoustic scenarios as low signal-to-noise ratios, current speech
enhancement systems usually suffer from performance bottleneck in extracting the target …
Speech enhancement with score-based generative models in the complex STFT domain
Score-based generative models (SGMs) have recently shown impressive results for difficult
generative tasks such as the unconditional and conditional generation of natural images …