Speech enhancement and dereverberation with diffusion-based generative models

J Richter, S Welker, JM Lemercier… - … on Audio, Speech …, 2023 - ieeexplore.ieee.org
In this work, we build upon our previous publication and use diffusion-based generative
models for speech enhancement. We present a detailed overview of the diffusion process …

The third 'CHiME' speech separation and recognition challenge: Dataset, task and baselines

J Barker, R Marxer, E Vincent… - 2015 IEEE Workshop on …, 2015 - ieeexplore.ieee.org
The CHiME challenge series aims to advance far field speech recognition technology by
promoting research at the interface of signal processing and automatic speech recognition …

A summary of the REVERB challenge: state-of-the-art and remaining challenges in reverberant speech processing research

K Kinoshita, M Delcroix, S Gannot, EA P. Habets… - EURASIP Journal on …, 2016 - Springer
In recent years, substantial progress has been made in the field of reverberant speech
signal processing, including both single- and multichannel dereverberation techniques and …

The REVERB challenge: A common evaluation framework for dereverberation and recognition of reverberant speech

K Kinoshita, M Delcroix, T Yoshioka… - … IEEE Workshop on …, 2013 - ieeexplore.ieee.org
Recently, substantial progress has been made in the field of reverberant speech signal
processing, including both single- and multichannel dereverberation techniques, and …

Building and evaluation of a real room impulse response dataset

I Szöke, M Skácel, L Mošner, J Paliesek… - IEEE Journal of …, 2019 - ieeexplore.ieee.org
This paper presents BUT ReverbDB, a dataset of real room impulse responses (RIR),
background noises, and retransmitted speech data. The retransmitted data include …

[BOOK] Distant speech recognition

M Wölfel, J McDonough - 2009 - books.google.com
A complete overview of distant automatic speech recognition. The performance of
conventional Automatic Speech Recognition (ASR) systems degrades dramatically as soon …

Deep learning based multi-source localization with source splitting and its effectiveness in multi-talker speech recognition

AS Subramanian, C Weng, S Watanabe, M Yu… - Computer Speech & …, 2022 - Elsevier
Multi-source localization is an important and challenging technique for multi-talker
conversation analysis. This paper proposes a novel supervised learning method using deep …

Monaural speech dereverberation using temporal convolutional networks with self attention

Y Zhao, DL Wang, B Xu, T Zhang - IEEE/ACM transactions on …, 2020 - ieeexplore.ieee.org
In daily listening environments, human speech is often degraded by room reverberation,
especially under highly reverberant conditions. Such degradation poses a challenge for …

CMGAN: Conformer-based metric-GAN for monaural speech enhancement

S Abdulatif, R Cao, B Yang - IEEE/ACM Transactions on Audio …, 2024 - ieeexplore.ieee.org
In this work, we further develop the conformer-based metric generative adversarial network
(CMGAN) model for speech enhancement (SE) in the time-frequency (TF) domain. This …

On the vulnerability of speaker verification to realistic voice spoofing

SK Ergünay, E Khoury, A Lazaridis… - 2015 IEEE 7th …, 2015 - ieeexplore.ieee.org
Automatic speaker verification (ASV) systems are subject to various kinds of malicious
attacks. Replay, voice conversion and speech synthesis attacks drastically degrade the …