An overview of voice conversion and its challenges: From statistical modeling to deep learning
Speaker identity is one of the important characteristics of human speech. In voice
conversion, we change the speaker identity from one to another, while keeping the linguistic …
Sixty years of frequency-domain monaural speech enhancement: From traditional to deep learning methods
Frequency-domain monaural speech enhancement has been extensively studied for over
60 years, and a great number of methods have been proposed and applied to many …
Machine learning in acoustics: Theory and applications
Acoustic data provide scientific and engineering insights in fields ranging from biology and
communications to ocean and Earth science. We survey the recent advances and …
Raw waveform-based speech enhancement by fully convolutional networks
This study proposes a fully convolutional network (FCN) model for raw waveform-based
speech enhancement. The proposed system performs speech enhancement in an end-to …
Interactive speech and noise modeling for speech enhancement
C Zheng, X Peng, Y Zhang, S Srinivasan… - Proceedings of the AAAI …, 2021 - ojs.aaai.org
Speech enhancement is challenging because of the diversity of background noise types.
Most of the existing methods are focused on modelling the speech rather than the noise. In …
Hawkes processes for events in social media
This chapter provides an accessible introduction for point processes, and especially Hawkes
processes, for modeling discrete, inter-dependent events over continuous time. We start by …
Speech feature denoising and dereverberation via deep autoencoders for noisy reverberant speech recognition
Denoising autoencoders (DAs) have shown success in generating robust features for
images, but there has been limited work in applying DAs for speech. In this paper we …
SNR-Aware Convolutional Neural Network Modeling for Speech Enhancement
This paper proposes a signal-to-noise-ratio (SNR) aware convolutional neural network
(CNN) model for speech enhancement (SE). Because the CNN model can deal with local …
Deep learning for video classification and captioning
Part I: Multimedia Content Analysis. 1 Deep Learning for Video Classification and …
Speech enhancement based on teacher–student deep learning using improved speech presence probability for noise-robust speech recognition
In this paper, we propose a novel teacher-student learning framework for the preprocessing
of a speech recognizer, leveraging the online noise tracking capabilities of improved minima …