Generative adversarial networks for speech processing: A review

A Wali, Z Alamgir, S Karim, A Fawaz, MB Ali… - Computer Speech & …, 2022 - Elsevier
Generative adversarial networks (GANs) have seen remarkable progress in recent years.
They are used as generative models for all kinds of data such as text, images, audio, music …

Real time speech enhancement in the waveform domain

A Defossez, G Synnaeve, Y Adi - arXiv preprint arXiv:2006.12847, 2020 - arxiv.org
We present a causal speech enhancement model working on the raw waveform that runs in
real-time on a laptop CPU. The proposed model is based on an encoder-decoder …

SERGAN: Speech enhancement using relativistic generative adversarial networks with gradient penalty

D Baby, S Verhulst - ICASSP 2019-2019 IEEE international …, 2019 - ieeexplore.ieee.org
Popular neural network-based speech enhancement systems operate on the magnitude
spectrogram and ignore the phase mismatch between the noisy and clean speech signals …

Deep neural networks for automatic speech processing: a survey from large corpora to limited data

V Roger, J Farinas, J Pinquier - EURASIP Journal on Audio, Speech, and …, 2022 - Springer
Most state-of-the-art speech systems use deep neural networks (DNNs). These systems
require a large amount of data to be learned. Hence, training state-of-the-art frameworks on …

A parallel-data-free speech enhancement method using multi-objective learning cycle-consistent generative adversarial network

Y Xiang, C Bao - IEEE/ACM Transactions on Audio, Speech …, 2020 - ieeexplore.ieee.org
Recently, deep neural networks (DNNs) have become the mainstream strategy for speech
enhancement task because it can achieve the higher speech quality and intelligibility than …

Adversarial regularization for attention based end-to-end robust speech recognition

S Sun, P Guo, L Xie, MY Hwang - IEEE/ACM Transactions on …, 2019 - ieeexplore.ieee.org
End-to-end speech recognition, such as attention based approaches, is an emerging and
attractive topic in recent years. It has achieved comparable performance with the traditional …

Adversarial joint training with self-attention mechanism for robust end-to-end speech recognition

L Li, Y Kang, Y Shi, L Kürzinger, T Watzel… - EURASIP Journal on …, 2021 - Springer
Lately, the self-attention mechanism has marked a new milestone in the field of automatic
speech recognition (ASR). Nevertheless, its performance is susceptible to environmental …

SpecAugment impact on automatic speaker verification system

MY Faisal, S Suyanto - 2019 international seminar on research …, 2019 - ieeexplore.ieee.org
An automatic speaker verification (ASV) is one of the challenging problem in speech
processing since there are so many models of machine learnings those capable of …

Speech enhancement using forked generative adversarial networks with spectral subtraction

J Lin, S Niu, Z Wei, X Lan, AJ Wijngaarden… - … of Interspeech 2019, 2019 - par.nsf.gov
Speech enhancement techniques that use a generative adversarial network (GAN) can
effectively suppress noise while allowing models to be trained end-to-end. However, such …

Synthesis speech based data augmentation for low resource children ASR

V Kadyan, H Kathania, P Govil, M Kurimo - Speech and Computer: 23rd …, 2021 - Springer
Successful speech recognition for children requires large training data with sufficient
speaker variability. The collection of such a training database of children's voices is …