Sixty years of frequency-domain monaural speech enhancement: From traditional to deep learning methods

C Zheng, H Zhang, W Liu, X Luo, A Li, X Li… - Trends in …, 2023 - journals.sagepub.com
Frequency-domain monaural speech enhancement has been extensively studied for over
60 years, and a great number of methods have been proposed and applied to many …

A survey of audio enhancement algorithms for music, speech, bioacoustics, biomedical, industrial and environmental sounds by image U-Net

S Gul, MS Khan - IEEE Access, 2023 - ieeexplore.ieee.org
The recent surge in the use of Deep Neural Networks (DNNs) has also made its mark in the
field of Audio Enhancement (AE), providing much better quality than the classical methods …

Fast spectrogram inversion using multi-head convolutional neural networks

SÖ Arık, H Jun, G Diamos - IEEE Signal Processing Letters, 2018 - ieeexplore.ieee.org
We propose the multi-head convolutional neural network (MCNN) for waveform synthesis
from spectrograms. Nonlinear interpolation in MCNN is employed with transposed …

Adversarial generation of time-frequency features with application in audio synthesis

A Marafioti, N Perraudin… - … on machine learning, 2019 - proceedings.mlr.press
Time-frequency (TF) representations provide powerful and intuitive features for the analysis
of time series such as audio. But still, generative modeling of audio in the TF domain is a …

A context encoder for audio inpainting

A Marafioti, N Perraudin, N Holighaus… - … /ACM Transactions on …, 2019 - ieeexplore.ieee.org
In this article, we study the ability of deep neural networks (DNNs) to restore missing audio
content based on its context, i.e., inpaint audio gaps. We focus on a condition which has not …

Research of planetary gear fault diagnosis based on permutation entropy of CEEMDAN and ANFIS

M Kuai, G Cheng, Y Pang, Y Li - Sensors, 2018 - mdpi.com
Because planetary gears have the characteristics of small volume, light weight and large
transmission ratio, they are widely used in high-speed and high-power mechanical systems. Poor …

GACELA: A generative adversarial context encoder for long audio inpainting of music

A Marafioti, P Majdak, N Holighaus… - IEEE Journal of …, 2020 - ieeexplore.ieee.org
In this article, we introduce GACELA, a conditional generative adversarial network (cGAN)
designed to restore missing audio data with durations ranging between hundreds of …

A comparison of recent neural vocoders for speech signal reconstruction

P Govalkar, J Fischer, F Zalkow… - Proc. 10th ISCA speech …, 2019 - isca-archive.org
In recent years, text-to-speech (TTS) synthesis has benefited from advanced machine
learning approaches. Most prominently, since the introduction of the WaveNet architecture …

Deep Griffin–Lim iteration: Trainable iterative phase reconstruction using neural network

Y Masuyama, K Yatabe, Y Koizumi… - IEEE Journal of …, 2020 - ieeexplore.ieee.org
In this paper, we propose a phase reconstruction framework, named Deep Griffin-Lim
Iteration (DeGLI). Phase reconstruction is a fundamental technique for improving the quality …

Griffin–Lim like phase recovery via alternating direction method of multipliers

Y Masuyama, K Yatabe… - IEEE Signal Processing …, 2018 - ieeexplore.ieee.org
Recovering a signal from its amplitude spectrogram, or phase recovery, exhibits many
applications in acoustic signal processing. When only an amplitude spectrogram is available …