Summary on the ICASSP 2022 multi-channel multi-party meeting transcription grand challenge
The ICASSP 2022 Multi-channel Multi-party Meeting Transcription Grand Challenge
(M2MeT) focuses on one of the most valuable and the most challenging scenarios of speech …
Implicit neural spatial filtering for multichannel source separation in the waveform domain
We present a single-stage causal waveform-to-waveform multichannel model that can
separate moving sound sources based on their broad spatial locations in a dynamic …
L-SpEx: Localized target speaker extraction
Speaker extraction aims to extract the target speaker's voice from a multi-talker speech
mixture given an auxiliary reference utterance. Recent studies show that speaker extraction …
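The snippet above describes speaker extraction driven by an auxiliary reference utterance. As a rough illustration of how such a cue is commonly fused, the sketch below mean-pools reference features into a speaker embedding and appends it to every mixture frame before any mask estimation; this is a generic SpEx-style arrangement, not the L-SpEx architecture (which additionally localizes the target speaker), and the function name and feature shapes are assumptions.

    # Hedged sketch: conditioning a separation front-end on a reference-speaker
    # embedding. Shapes and the concatenation-based fusion are assumptions.
    import numpy as np


    def condition_on_speaker(mix_feats, ref_feats):
        """Append a mean-pooled reference-speaker embedding to each mixture frame."""
        emb = ref_feats.mean(axis=0)                       # (D,) speaker embedding
        tiled = np.tile(emb, (mix_feats.shape[0], 1))      # repeat for every frame
        return np.concatenate([mix_feats, tiled], axis=1)  # (T, 2D) conditioned input


    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        mix = rng.standard_normal((100, 64))   # mixture features: 100 frames, 64 dims
        ref = rng.standard_normal((80, 64))    # enrollment features of the target speaker
        print(condition_on_speaker(mix, ref).shape)  # (100, 128)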
DESNet: A multi-channel network for simultaneous speech dereverberation, enhancement and separation
In this paper, we propose a multi-channel network for simultaneous speech dereverberation,
enhancement and separation (DESNet). To enable gradient propagation and joint …
A comparative study on speaker-attributed automatic speech recognition in multi-party meetings
In this paper, we conduct a comparative study on speaker-attributed automatic speech
recognition (SA-ASR) in the multi-party meeting scenario, a topic with increasing attention in …
BA-SOT: Boundary-aware serialized output training for multi-talker ASR
The recently proposed serialized output training (SOT) simplifies multi-talker automatic
speech recognition (ASR) by generating speaker transcriptions separated by a special …
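SOT, as referenced in the snippet above, flattens the transcriptions of overlapping speakers into one token stream separated by a special speaker-change token. Below is a minimal sketch of building such a serialized training target; the token string "<sc>", the ordering by start time, and the helper name are illustrative, and the boundary-aware extensions proposed in BA-SOT are not modeled here.

    # Minimal sketch of serialized output training (SOT) target construction.
    # Assumption: utterances are serialized in order of start time and joined
    # with a speaker-change token; "<sc>" and the helper name are illustrative.
    from typing import List, Tuple

    SC_TOKEN = "<sc>"  # special speaker-change symbol emitted between speakers


    def serialize_transcripts(utterances: List[Tuple[float, str]]) -> str:
        """Build a single SOT target from (start_time, transcript) pairs."""
        ordered = sorted(utterances, key=lambda u: u[0])
        return f" {SC_TOKEN} ".join(text for _, text in ordered)


    if __name__ == "__main__":
        segment = [
            (0.3, "good morning everyone"),
            (1.1, "sorry to interrupt"),
            (2.4, "let's start with the agenda"),
        ]
        print(serialize_transcripts(segment))
        # -> "good morning everyone <sc> sorry to interrupt <sc> let's start with the agenda"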
A neural beamspace-domain filter for real-time multi-channel speech enhancement
Most deep-learning-based multi-channel speech enhancement methods focus on designing
a set of beamforming coefficients, to directly filter the low signal-to-noise ratio signals …
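For context on what "designing a set of beamforming coefficients" means in the snippet above, the sketch below applies fixed per-frequency, per-channel weights to a multi-channel STFT and sums across microphones. It is a plain delay-and-sum beamformer, not the beamspace-domain filter proposed in the paper; the far-field linear-array geometry and all names are assumptions.

    # Generic filter-and-sum beamforming: y(t, f) = sum_m w_m(f)^* x_m(t, f).
    # Delay-and-sum weights assume a far-field plane wave at a known angle.
    import numpy as np


    def delay_and_sum_weights(mic_positions, angle_rad, freqs, c=343.0):
        """Steering-vector weights for a far-field plane wave (2-D array geometry)."""
        direction = np.array([np.cos(angle_rad), np.sin(angle_rad)])
        delays = mic_positions @ direction / c                             # (M,)
        steering = np.exp(-2j * np.pi * freqs[None, :] * delays[:, None])  # (M, F)
        return steering / len(mic_positions)


    def apply_beamformer(weights, stft):
        """Filter-and-sum: weights (M, F), stft (M, T, F) -> enhanced (T, F)."""
        return np.einsum("mf,mtf->tf", np.conj(weights), stft)


    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        M, T, F = 4, 50, 257
        mic_positions = np.stack([np.arange(M) * 0.05, np.zeros(M)], axis=1)  # 5 cm spacing
        freqs = np.linspace(0, 8000, F)
        x = rng.standard_normal((M, T, F)) + 1j * rng.standard_normal((M, T, F))
        w = delay_and_sum_weights(mic_positions, np.deg2rad(60), freqs)
        print(apply_beamformer(w, x).shape)  # (50, 257)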
Investigation of practical aspects of single channel speech separation for ASR
Speech separation has been successfully applied as a frontend processing module of
conversation transcription systems thanks to its ability to handle overlapped speech and its …
A separation and interaction framework for causal multi-channel speech enhancement
Multi-channel speech enhancement aims at extracting the desired speech using a
microphone array, which has many potential applications, such as video conferencing …
Streaming Multi-Channel Speech Separation with Online Time-Domain Generalized Wiener Filter
Y Luo - ICASSP 2023 - ieeexplore.ieee.org
Most existing streaming neural-network-based multi-channel speech separation systems
consist of a causal network architecture and an online spatial information extraction module …
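As a rough, hedged illustration of the kind of spatial filter that the paper's online time-domain generalized Wiener filter extends, the sketch below computes a classic offline, frequency-domain multi-channel Wiener filter from a mixture STFT and a speech mask. None of the variable names or the covariance estimation choices below come from the paper itself.

    # Hedged sketch of a frequency-domain multi-channel Wiener filter:
    # w(f) = Phi_x(f)^{-1} Phi_s(f) e_ref, applied as y(t, f) = w(f)^H x(t, f).
    import numpy as np


    def multichannel_wiener_filter(mix_stft, speech_mask, ref_mic=0, eps=1e-6):
        """mix_stft: (M, T, F) complex STFT, speech_mask: (T, F) in [0, 1]."""
        M, T, F = mix_stft.shape
        out = np.zeros((T, F), dtype=complex)
        for f in range(F):
            X = mix_stft[:, :, f]                              # (M, T)
            m = speech_mask[:, f]                              # (T,)
            phi_x = (X @ X.conj().T) / T                       # mixture covariance
            phi_s = (X * m) @ X.conj().T / max(m.sum(), eps)   # masked speech covariance
            w = np.linalg.solve(phi_x + eps * np.eye(M), phi_s[:, ref_mic])
            out[:, f] = w.conj() @ X
        return out


    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        M, T, F = 4, 40, 129
        mix = rng.standard_normal((M, T, F)) + 1j * rng.standard_normal((M, T, F))
        mask = rng.uniform(size=(T, F))
        print(multichannel_wiener_filter(mix, mask).shape)  # (40, 129)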