MetricGAN+: An improved version of MetricGAN for speech enhancement
The discrepancy between the cost function used for training a speech enhancement model
and human auditory perception usually makes the quality of enhanced speech …
Boosting self-supervised embeddings for speech enhancement
Self-supervised learning (SSL) representation for speech has achieved state-of-the-art
(SOTA) performance on several downstream tasks. However, there remains room for …
Perceptual contrast stretching on target feature for speech enhancement
Speech enhancement (SE) performance has improved considerably owing to the use of
deep learning models as a base function. Herein, we propose a perceptual contrast …
Audio-visual speech enhancement using self-supervised learning to improve speech intelligibility in cochlear implant simulations
Individuals with hearing impairments face challenges in their ability to comprehend speech,
particularly in noisy environments. The aim of this study is to explore the effectiveness of …
An empirical study on the impact of positional encoding in transformer-based monaural speech enhancement
Transformer architecture has enabled recent progress in speech enhancement. Since
Transformers are position-agnostic, positional encoding is the de facto standard component …
Transformers with competitive ensembles of independent mechanisms
An important development in deep learning from the earliest MLPs has been a move
towards architectures with structural inductive biases which enable the model to keep …
An Investigation of Incorporating Mamba for Speech Enhancement
R Chao, WH Cheng, M La Quatra… - arXiv preprint arXiv …, 2024 - arxiv.org
This work aims to study a scalable state-space model (SSM), Mamba, for the speech
enhancement (SE) task. We exploit a Mamba-based regression model to characterize …
VSET: A multimodal transformer for visual speech enhancement
The transformer architecture has shown great capability in learning long-term dependency
and works well in multiple domains. However, transformer has been less considered in …
Improving character error rate is not equal to having clean speech: Speech enhancement for ASR systems with black-box acoustic models
R Sawata, Y Kashiwagi… - ICASSP 2022-2022 IEEE …, 2022 - ieeexplore.ieee.org
A deep neural network (DNN)-based speech enhancement (SE) aiming to maximize the
performance of an automatic speech recognition (ASR) system is proposed in this paper. In …
OSSEM: one-shot speaker adaptive speech enhancement using meta learning
Although deep learning (DL) has achieved notable progress in speech enhancement (SE),
further research is still required for a DL-based SE system to adapt effectively and efficiently …