Summary on the ICASSP 2022 multi-channel multi-party meeting transcription grand challenge

F Yu, S Zhang, P Guo, Y Fu, Z Du… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org
The ICASSP 2022 Multi-channel Multi-party Meeting Transcription Grand Challenge
(M2MeT) focuses on one of the most valuable and most challenging scenarios of speech …

Implicit neural spatial filtering for multichannel source separation in the waveform domain

D Markovic, A Defossez, A Richard - arXiv preprint arXiv:2206.15423, 2022 - arxiv.org
We present a single-stage causal waveform-to-waveform multichannel model that can
separate moving sound sources based on their broad spatial locations in a dynamic …

L-SpEx: Localized target speaker extraction

M Ge, C Xu, L Wang, ES Chng… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org
Speaker extraction aims to extract the target speaker's voice from a multi-talker speech
mixture given an auxiliary reference utterance. Recent studies show that speaker extraction …

DESNet: A multi-channel network for simultaneous speech dereverberation, enhancement and separation

Y Fu, J Wu, Y Hu, M Xing, L Xie - 2021 IEEE Spoken Language …, 2021 - ieeexplore.ieee.org
In this paper, we propose a multi-channel network for simultaneous speech dereverberation,
enhancement and separation (DESNet). To enable gradient propagation and joint …

A comparative study on speaker-attributed automatic speech recognition in multi-party meetings

F Yu, Z Du, S Zhang, Y Lin, L Xie - arXiv preprint arXiv:2203.16834, 2022 - arxiv.org
In this paper, we conduct a comparative study on speaker-attributed automatic speech
recognition (SA-ASR) in the multi-party meeting scenario, a topic with increasing attention in …

BA-SOT: Boundary-aware serialized output training for multi-talker ASR

Y Liang, F Yu, Y Li, P Guo, S Zhang, Q Chen… - arXiv preprint arXiv …, 2023 - arxiv.org
The recently proposed serialized output training (SOT) simplifies multi-talker automatic
speech recognition (ASR) by generating speaker transcriptions separated by a special …

A neural beamspace-domain filter for real-time multi-channel speech enhancement

W Liu, A Li, X Wang, M Yuan, Y Chen, C Zheng, X Li - Symmetry, 2022 - mdpi.com
Most deep-learning-based multi-channel speech enhancement methods focus on designing
a set of beamforming coefficients to directly filter the low signal-to-noise-ratio signals …
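The "beamforming coefficients" these methods learn generalize the classic delay-and-sum beamformer, which aligns each microphone channel by its steering delay and averages. A minimal sketch for orientation (a textbook illustration, not the cited paper's method; the function name and frequency-domain alignment are my own choices):

```python
import numpy as np

def delay_and_sum(signals, delays, fs):
    """Classic delay-and-sum beamformer.

    signals: (n_channels, n_samples) array of microphone signals.
    delays:  (n_channels,) steering delays in seconds for each channel.
    fs:      sampling rate in Hz.
    Each channel is advanced by its delay via a frequency-domain
    phase shift, then the aligned channels are averaged.
    """
    n_ch, n = signals.shape
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)        # (n//2 + 1,)
    spec = np.fft.rfft(signals, axis=1)           # per-channel spectra
    # exp(+j 2*pi*f*tau) advances a channel delayed by tau, aligning it
    phase = np.exp(2j * np.pi * freqs[None, :] * delays[:, None])
    aligned = spec * phase
    return np.fft.irfft(aligned.mean(axis=0), n=n)
```

With zero delays this reduces to a plain channel average; with correct steering delays the target adds coherently while diffuse noise averages down, which is the low-SNR filtering effect the abstract refers to.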

Investigation of practical aspects of single channel speech separation for ASR

J Wu, Z Chen, S Chen, Y Wu, T Yoshioka… - arXiv preprint arXiv …, 2021 - arxiv.org
Speech separation has been successfully applied as a frontend processing module of
conversation transcription systems thanks to its ability to handle overlapped speech and its …

A separation and interaction framework for causal multi-channel speech enhancement

W Liu, A Li, C Zheng, X Li - Digital Signal Processing, 2022 - Elsevier
Multi-channel speech enhancement aims at extracting the desired speech using a
microphone array, which has many potential applications, such as video conferencing …

Streaming Multi-Channel Speech Separation with Online Time-Domain Generalized Wiener Filter

Y Luo - ICASSP 2023-2023 IEEE International Conference on …, 2023 - ieeexplore.ieee.org
Most existing streaming neural-network-based multi-channel speech separation systems
consist of a causal network architecture and an online spatial information extraction module …
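The last entry's generalized Wiener filter is an online, time-domain variant; as background, the classic per-frequency multichannel Wiener filter picks weights from the spatial covariances of speech and noise. A minimal real-valued sketch (an illustration of the standard formulation, not the paper's algorithm; function name and reference-channel convention are my own):

```python
import numpy as np

def multichannel_wiener_filter(Rs, Rn, ref=0):
    """Classic multichannel Wiener filter for one frequency bin:
    w = (Rs + Rn)^{-1} Rs e_ref, where Rs and Rn are the
    (n_ch, n_ch) spatial covariance matrices of speech and noise
    and e_ref selects the reference microphone."""
    e = np.zeros(Rs.shape[0])
    e[ref] = 1.0
    return np.linalg.solve(Rs + Rn, Rs @ e)
```

In the single-channel case this collapses to the familiar scalar Wiener gain S/(S+N), e.g. speech power 4 and noise power 1 give a gain of 0.8.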