EEND-SS: Joint end-to-end neural speaker diarization and speech separation for flexible number of speakers
In this paper, we present a novel framework that jointly performs three tasks: speaker
diarization, speech separation, and speaker counting. Our proposed framework integrates …
Speaker counting and separation from single-channel noisy mixtures
SR Chetupalli, EAP Habets - IEEE/ACM Transactions on Audio …, 2023 - ieeexplore.ieee.org
We address the problem of speaker counting and separation from a noisy, single-channel,
multi-source, recording. Most of the works in the literature assume mixtures containing two to …
A Single Speech Enhancement Model Unifying Dereverberation, Denoising, Speaker Counting, Separation, and Extraction
We propose a multi-task universal speech enhancement (MUSE) model that can perform
five speech enhancement (SE) tasks: dereverberation, denoising, speech separation (SS) …
Multi-microphone speaker separation by spatial regions
J Wechsler, SR Chetupalli, W Mack… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org
We consider the task of region-based source separation of reverberant multi-microphone
recordings. We assume pre-defined spatial regions with a single active source per region …
Boosting Unknown-Number Speaker Separation with Transformer Decoder-Based Attractor
We propose a novel speech separation model designed to separate mixtures with an
unknown number of speakers. The proposed model stacks 1) a dual-path processing block …
Neural Fast Full-Rank Spatial Covariance Analysis for Blind Source Separation
This paper describes an efficient unsupervised learning method for a neural source
separation model that utilizes a probabilistic generative model of observed multichannel …
Target language extraction at multilingual cocktail parties
Typically, target speaker extraction seeks to extract a target speaker's contribution according
to his or her individual voice characteristics. In a “multilingual cocktail party” however …
Speech Separation for an Unknown Number of Speakers Using Transformers With Encoder-Decoder Attractors
SR Chetupalli, EAP Habets - INTERSPEECH, 2022 - isca-archive.org
Speaker-independent speech separation for single-channel mixtures with an unknown
number of multiple speakers in the waveform domain is considered in this paper. To deal …
TaCNet: Temporal audio source counting network
A Ahmadnejad, AM Darviishani, MM Asadi… - arXiv preprint arXiv …, 2023 - arxiv.org
In this paper, we introduce the Temporal Audio Source Counting Network (TaCNet), an
innovative architecture that addresses limitations in audio source counting tasks. TaCNet …
Ultrasonic Through-Metal Communication Based on Deep-Learning-Assisted Echo Cancellation
J Zhang, M Jiang, J Zhang, M Gu, Z Cao - Sensors, 2024 - mdpi.com
Ultrasound is extremely efficient for wireless signal transmission through metal barriers due
to no limit of the Faraday shielding effect. Echoing in the ultrasonic channel is one of the …