Bangla natural language processing: A comprehensive analysis of classical, machine learning, and deep learning-based methods
The Bangla language is the seventh most spoken language, with 265 million native and non-
native speakers worldwide. However, English is the predominant language for online …
Audio-visual speech enhancement using conditional variational auto-encoders
M Sadeghi, S Leglaive… - … on Audio, Speech …, 2020 - ieeexplore.ieee.org
Variational auto-encoders (VAEs) are deep generative latent variable models that can be
used for learning the distribution of complex data. VAEs have been successfully used to …
A benchmark of dynamical variational autoencoders applied to speech spectrogram modeling
The Variational Autoencoder (VAE) is a powerful deep generative model that is now
extensively used to represent high-dimensional complex data via a low-dimensional latent …
Deep Griffin–Lim iteration: Trainable iterative phase reconstruction using neural network
In this paper, we propose a phase reconstruction framework, named Deep Griffin-Lim
Iteration (DeGLI). Phase reconstruction is a fundamental technique for improving the quality …
A flow-based deep latent variable model for speech spectrogram modeling and enhancement
AA Nugraha, K Sekiguchi… - IEEE/ACM Transactions on …, 2020 - ieeexplore.ieee.org
This article describes a deep latent variable model of speech power spectrograms and its
application to semi-supervised speech enhancement with a deep speech prior. By …
Online phase reconstruction via DNN-based phase differences estimation
Y Masuyama, K Yatabe, K Nagatomo… - … /ACM Transactions on …, 2022 - ieeexplore.ieee.org
This paper presents a two-stage online phase reconstruction framework using causal deep
neural networks (DNNs). Phase reconstruction is a task of recovering phase of the short-time …
Phase reconstruction based on recurrent phase unwrapping with deep neural networks
Phase reconstruction, which estimates phase from a given amplitude spectrogram, is an
active research field in acoustical signal processing with many applications including audio …
Inter-frequency phase difference for phase reconstruction using deep neural networks and maximum likelihood
This paper presents improvements to two-stage algorithms for estimating the short-time
Fourier transform (STFT) phase from only the amplitude by using deep neural networks …
A statistically principled and computationally efficient approach to speech enhancement using variational autoencoders
Recent studies have explored the use of deep generative models of speech spectra based
on variational autoencoders (VAEs), combined with unsupervised noise models, to perform …
Bangla natural language processing: A comprehensive review of classical machine learning and deep learning based methods
The Bangla language is the seventh most spoken language, with 265 million native and non-
native speakers worldwide. However, English is the predominant language for online …