Speech enhancement and dereverberation with diffusion-based generative models
In this work, we build upon our previous publication and use diffusion-based generative
models for speech enhancement. We present a detailed overview of the diffusion process …
The third 'CHiME' speech separation and recognition challenge: Dataset, task and baselines
The CHiME challenge series aims to advance far field speech recognition technology by
promoting research at the interface of signal processing and automatic speech recognition …
A summary of the REVERB challenge: state-of-the-art and remaining challenges in reverberant speech processing research
In recent years, substantial progress has been made in the field of reverberant speech
signal processing, including both single- and multichannel dereverberation techniques and …
The REVERB challenge: A common evaluation framework for dereverberation and recognition of reverberant speech
Recently, substantial progress has been made in the field of reverberant speech signal
processing, including both single- and multichannel dereverberation techniques, and …
Building and evaluation of a real room impulse response dataset
This paper presents BUT ReverbDB, a dataset of real room impulse responses (RIR),
background noises, and retransmitted speech data. The retransmitted data include …
[BOOK][B] Distant speech recognition
M Wölfel, J McDonough - 2009 - books.google.com
A complete overview of distant automatic speech recognition. The performance of
conventional Automatic Speech Recognition (ASR) systems degrades dramatically as soon …
Deep learning based multi-source localization with source splitting and its effectiveness in multi-talker speech recognition
Multi-source localization is an important and challenging technique for multi-talker
conversation analysis. This paper proposes a novel supervised learning method using deep …
Monaural speech dereverberation using temporal convolutional networks with self attention
In daily listening environments, human speech is often degraded by room reverberation,
especially under highly reverberant conditions. Such degradation poses a challenge for …
CMGAN: Conformer-based metric-GAN for monaural speech enhancement
S Abdulatif, R Cao, B Yang - IEEE/ACM Transactions on Audio …, 2024 - ieeexplore.ieee.org
In this work, we further develop the conformer-based metric generative adversarial network
(CMGAN) model for speech enhancement (SE) in the time-frequency (TF) domain. This …
On the vulnerability of speaker verification to realistic voice spoofing
Automatic speaker verification (ASV) systems are subject to various kinds of malicious
attacks. Replay, voice conversion and speech synthesis attacks drastically degrade the …