Attention-inspired artificial neural networks for speech processing: A systematic review

N Zacarias-Morales, P Pancardo… - Symmetry, 2021 - mdpi.com
Artificial Neural Networks (ANNs) were created inspired by the neural networks in the
human brain and have been widely applied in speech processing. The application areas of …

Monaural speech enhancement with complex convolutional block attention module and joint time frequency losses

S Zhao, TH Nguyen, B Ma - ICASSP 2021-2021 IEEE …, 2021 - ieeexplore.ieee.org
Deep complex U-Net structure and convolutional recurrent network (CRN) structure achieve
state-of-the-art performance for monaural speech enhancement. Both deep complex U-Net …

Speech enhancement algorithm based on a convolutional neural network reconstruction of the temporal envelope of speech in noisy environments

R Soleymanpour, M Soleymanpour, AJ Brammer… - IEEE …, 2023 - ieeexplore.ieee.org
Temporal modulation processing is a promising technique for improving the intelligibility and
quality of speech in noise. We propose a speech enhancement algorithm that constructs the …

Towards more efficient DNN-based speech enhancement using quantized correlation mask

S Abdullah, M Zamani, A Demosthenous - IEEE Access, 2021 - ieeexplore.ieee.org
Many studies on deep learning-based speech enhancement (SE) utilizing the computational
auditory scene analysis method typically employ the ideal binary mask or the ideal ratio …

Restoring degraded speech via a modified diffusion model

J Zhang, S Jayasuriya, V Berisha - arXiv preprint arXiv:2104.11347, 2021 - arxiv.org
There are many deterministic mathematical operations (e.g., compression, clipping,
downsampling) that degrade speech quality considerably. In this paper we introduce a …

Diagnosis of exercise-induced cardiac fatigue based on deep learning and heart sounds

C Yin, X Zhou, Y Zhao, Y Zheng, Y Shi, X Yan, X Guo - Applied Acoustics, 2022 - Elsevier
Exercise-induced cardiac fatigue (EICF) refers to an impermanent decline in systolic and
diastolic function caused by high-intensity and multi-frequency exercise. Long-term EICF …

DPHT-ANet: Dual-path high-order transformer-style fully attentional network for monaural speech enhancement

N Saleem, S Bourouis, H Elmannai, AD Algarni - Applied Acoustics, 2024 - Elsevier
Dual-path Transformer-style models have demonstrated significant effectiveness in speech
enhancement. However, extensive parameterization and computational complexity present …

Efficient audio-visual speech enhancement using deep U-Net with early fusion of audio and video information and RNN attention blocks

JW Hwang, RH Park, HM Park - IEEE Access, 2021 - ieeexplore.ieee.org
Speech enhancement (SE) aims to improve speech quality and intelligibility by removing
acoustic corruption. While various SE models using audio-only (AO) based on deep learning …

Improved relativistic cycle-consistent GAN with dilated residual network and multi-attention for speech enhancement

Y Wang, G Yu, J Wang, H Wang, Q Zhang - IEEE Access, 2020 - ieeexplore.ieee.org
Generative adversarial networks (GANs) have been increasingly used as feature mapping
functions in speech enhancement, in which the noisy speech features are transformed to the …

Enhancement of single channel speech quality and intelligibility in multiple noise conditions using Wiener filter and deep CNN

D Hepsiba, J Justin - Soft Computing, 2022 - Springer
Nowadays, the deep neural network has become the prime approach for enhancing speech
signals as it yields good results compared to the traditional methods. This paper describes …