Dual-branch attention-in-attention transformer for single-channel speech enhancement

G Yu, A Li, C Zheng, Y Guo, Y Wang… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org
Curriculum learning begins to thrive in the speech enhancement area, which decouples the
original spectrum estimation task into multiple easier sub-tasks to achieve better …

Unsupervised speech enhancement using dynamical variational autoencoders

X Bie, S Leglaive, X Alameda-Pineda… - IEEE/ACM Transactions …, 2022 - ieeexplore.ieee.org
Dynamical variational autoencoders (DVAEs) are a class of deep generative models with
latent variables, dedicated to model time series of high-dimensional data. DVAEs can be …

DBT-Net: Dual-branch federative magnitude and phase estimation with attention-in-attention transformer for monaural speech enhancement

G Yu, A Li, H Wang, Y Wang, Y Ke… - IEEE/ACM Transactions …, 2022 - ieeexplore.ieee.org
The decoupling-style concept begins to ignite in the speech enhancement area, which
decouples the original complex spectrum estimation task into multiple easier sub-tasks (i.e. …

MetricGAN-U: Unsupervised speech enhancement/dereverberation based only on noisy/reverberated speech

SW Fu, C Yu, KH Hung, M Ravanelli… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org
Most of the deep learning-based speech enhancement models are learned in a supervised
manner, which implies that pairs of noisy and clean speech are required during training …

Neural speech enhancement with unsupervised pre-training and mixture training

X Hao, C Xu, L Xie - Neural Networks, 2023 - Elsevier
Supervised neural speech enhancement methods always require a large scale of paired
noisy and clean speech data. Since collecting adequate paired data from real-world …

Exploring Multi-Stage GAN with Self-Attention for Speech Enhancement

BK Asiedu Asante, C Broni-Bediako, H Imamura - Applied Sciences, 2023 - mdpi.com
Multi-stage or multi-generator generative adversarial networks (GANs) have recently been
demonstrated to be effective for speech enhancement. The existing multi-generator GANs …

Synthesizing Lithuanian voice replacement for laryngeal cancer patients with Pareto-optimized flow-based generative synthesis network

R Maskeliunas, R Damasevicius, A Kulikajevas… - Applied Acoustics, 2024 - Elsevier
This study presents a Pareto-optimized flow-based generative network for speech synthesis:
the P-GLOW model in Lithuanian speech synthesis for substituting original voices affected …

Unsupervised Face-Mask Speech Enhancement Using Generative Adversarial Networks with Human-in-the-Loop Assessment Metrics

SS Wang, JY Chen, BR Bai, SH Fang… - IEEE/ACM Transactions …, 2024 - ieeexplore.ieee.org
The utilization of face masks is an essential healthcare measure, particularly during times of
pandemics, yet it can present challenges in communication in our daily lives. To address this …

Optimizing Shoulder to Shoulder: A coordinated sub-band fusion model for real-time full-band speech enhancement

G Yu, A Li, W Liu, C Zheng, Y Wang… - arXiv preprint arXiv …, 2022 - arxiv.org
Due to the high computational complexity to model more frequency bands, it is still
intractable to conduct real-time full-band speech enhancement based on deep neural …

Optimizing shoulder to shoulder: A coordinated sub-band fusion model for full-band speech enhancement

G Yu, A Li, W Liu, C Zheng, Y Wang… - 2022 13th International …, 2022 - ieeexplore.ieee.org
Due to the high computational complexity to model more frequency bands, it is still
intractable to conduct full-band speech enhancement based on deep neural networks …