The multi-channel Wall Street Journal audio visual corpus (MC-WSJ-AV): Specification and...

J Richter, S Welker, JM Lemercier… - … on Audio, Speech …, 2023 - ieeexplore.ieee.org

In this work, we build upon our previous publication and use diffusion-based generative
models for speech enhancement. We present a detailed overview of the diffusion process …

Cited by 166 · Related articles · All 4 versions

[PDF] hal.science

The third 'CHiME' speech separation and recognition challenge: Dataset, task and baselines

J Barker, R Marxer, E Vincent… - 2015 IEEE Workshop on …, 2015 - ieeexplore.ieee.org

The CHiME challenge series aims to advance far field speech recognition technology by
promoting research at the interface of signal processing and automatic speech recognition …

Cited by 789 · Related articles · All 13 versions

[PDF] springer.com

A summary of the REVERB challenge: state-of-the-art and remaining challenges in reverberant speech processing research

K Kinoshita, M Delcroix, S Gannot, EAP Habets… - EURASIP Journal on …, 2016 - Springer

In recent years, substantial progress has been made in the field of reverberant speech
signal processing, including both single- and multichannel dereverberation techniques and …

Cited by 423 · Related articles · All 15 versions

[PDF] academia.edu

The REVERB challenge: A common evaluation framework for dereverberation and recognition of reverberant speech

K Kinoshita, M Delcroix, T Yoshioka… - … IEEE Workshop on …, 2013 - ieeexplore.ieee.org

Recently, substantial progress has been made in the field of reverberant speech signal
processing, including both single- and multichannel dereverberation techniques, and …

Cited by 471 · Related articles · All 9 versions

[PDF] arxiv.org

Building and evaluation of a real room impulse response dataset

I Szöke, M Skácel, L Mošner, J Paliesek… - IEEE Journal of …, 2019 - ieeexplore.ieee.org

This paper presents BUT ReverbDB, a dataset of real room impulse responses (RIR),
background noises, and retransmitted speech data. The retransmitted data include …

Cited by 159 · Related articles · All 5 versions

[BOOK][B] Distant speech recognition

M Wölfel, J McDonough - 2009 - books.google.com

A complete overview of distant automatic speech recognition. The performance of
conventional Automatic Speech Recognition (ASR) systems degrades dramatically as soon …

Cited by 455 · Related articles · All 6 versions

[PDF] arxiv.org

Deep learning based multi-source localization with source splitting and its effectiveness in multi-talker speech recognition

AS Subramanian, C Weng, S Watanabe, M Yu… - Computer Speech & …, 2022 - Elsevier

Multi-source localization is an important and challenging technique for multi-talker
conversation analysis. This paper proposes a novel supervised learning method using deep …

Cited by 81 · Related articles · All 5 versions

[PDF] ieee.org

Monaural speech dereverberation using temporal convolutional networks with self attention

Y Zhao, DL Wang, B Xu, T Zhang - IEEE/ACM transactions on …, 2020 - ieeexplore.ieee.org

In daily listening environments, human speech is often degraded by room reverberation,
especially under highly reverberant conditions. Such degradation poses a challenge for …

Cited by 101 · Related articles · All 7 versions

[PDF] arxiv.org

CMGAN: Conformer-based metric-GAN for monaural speech enhancement

S Abdulatif, R Cao, B Yang - IEEE/ACM Transactions on Audio …, 2024 - ieeexplore.ieee.org

In this work, we further develop the conformer-based metric generative adversarial network
(CMGAN) model for speech enhancement (SE) in the time-frequency (TF) domain. This …

Cited by 48 · Related articles · All 8 versions

[PDF] epfl.ch

On the vulnerability of speaker verification to realistic voice spoofing

SK Ergünay, E Khoury, A Lazaridis… - 2015 IEEE 7th …, 2015 - ieeexplore.ieee.org

Automatic speaker verification (ASV) systems are subject to various kinds of malicious
attacks. Replay, voice conversion and speech synthesis attacks drastically degrade the …

Cited by 160 · Related articles · All 12 versions