Generative adversarial networks for speech processing: A review
Generative adversarial networks (GANs) have seen remarkable progress in recent years.
They are used as generative models for all kinds of data such as text, images, audio, music …
CMGAN: Conformer-based metric GAN for speech enhancement
R Cao, S Abdulatif, B Yang - arXiv preprint arXiv:2203.15149, 2022 - arxiv.org
Recently, convolution-augmented transformer (Conformer) has achieved promising
performance in automatic speech recognition (ASR) and time-domain speech enhancement …
Systematic review of advanced AI methods for improving healthcare data quality in post COVID-19 Era
At the beginning of the COVID-19 pandemic, there was significant hype about the potential
impact of artificial intelligence (AI) tools in combatting COVID-19 on diagnosis, prognosis, or …
CMGAN: Conformer-based metric-GAN for monaural speech enhancement
S Abdulatif, R Cao, B Yang - IEEE/ACM Transactions on Audio …, 2024 - ieeexplore.ieee.org
In this work, we further develop the conformer-based metric generative adversarial network
(CMGAN) model for speech enhancement (SE) in the time-frequency (TF) domain. This …
ComposeInStyle: Music composition with and without Style Transfer
S Mukherjee, M Mulimani - Expert Systems with Applications, 2022 - Elsevier
Every music composition has a composer at the core of its building block, molding it into a
style of their own. The creative compositional style of a composer varies dynamically with …
Learning to denoise historical music
We propose an audio-to-audio neural network model that learns to denoise old music
recordings. Our model internally converts its input into a time-frequency representation by …
μ-law SGAN for generating spectra with more details in speech enhancement
The goal of monaural speech enhancement is to separate clean speech from noisy speech.
Recently, many studies have employed generative adversarial networks (GAN) to deal with …
Single channel speech enhancement by colored spectrograms
Speech enhancement concerns the processes required to remove unwanted background
sounds from the target speech to improve its quality and intelligibility. In this paper, a novel …
Audio denoising for robust audio fingerprinting
K Akesbi - arXiv preprint arXiv:2212.11277, 2022 - arxiv.org
Music discovery services let users identify songs from short mobile recordings. These
solutions are often based on Audio Fingerprinting, and rely more specifically on the …
Investigating cross-domain losses for speech enhancement
S Abdulatif, K Armanious, JT Sajeev… - 2021 29th European …, 2021 - ieeexplore.ieee.org
Recent years have seen a surge in the number of available frameworks for speech
enhancement (SE) and recognition. Whether model-based or constructed via deep learning …