Decoupling magnitude and phase estimation with deep ResUNet for music source separation
Deep neural network based methods have been successfully applied to music source
separation. They typically learn a mapping from a mixture spectrogram to a set of source …
DPT-FSNet: Dual-path transformer based full-band and sub-band fusion network for speech enhancement
Sub-band models have achieved promising results due to their ability to model local
patterns in the spectrogram. Some studies further improve the performance by fusing sub …
DCCRN+: Channel-wise subband DCCRN with SNR estimation for speech enhancement
Deep complex convolution recurrent network (DCCRN), which extends CRN with complex
structure, has achieved superior performance in MOS evaluation in Interspeech 2020 deep …
Separate what you describe: Language-queried audio source separation
In this paper, we introduce the task of language-queried audio source separation (LASS),
which aims to separate a target source from an audio mixture based on a natural language …
TEA-PSE 2.0: Sub-band network for real-time personalized speech enhancement
Personalized speech enhancement (PSE) utilizes additional cues like speaker embeddings
to remove background noise and interfering speech and extract the speech from target …
VoiceFixer: Toward general speech restoration with neural vocoder
Speech restoration aims to remove distortions in speech signals. Prior methods mainly focus
on single-task speech restoration (SSR), such as speech denoising or speech declipping …
The Sound Demixing Challenge 2023 – Music Demixing Track
This paper summarizes the music demixing (MDX) track of the Sound Demixing Challenge
(SDX'23). We provide a summary of the challenge setup and introduce the task of robust …
CWS-PResUNet: Music source separation with channel-wise subband phase-aware ResUNet
Music source separation (MSS) shows active progress with deep learning models in recent
years. Many MSS models perform separations on spectrograms by estimating bounded ratio …
Target sound extraction with variable cross-modality clues
Automatic target sound extraction (TSE) is a machine learning approach to mimic the human
auditory perception capability of attending to a sound source of interest from a mixture of …
Surrey system for DCASE 2022 task 5: Few-shot bioacoustic event detection with segment-level metric learning
Few-shot audio event detection is a task that detects the occurrence time of a novel sound
class given a few examples. In this work, we propose a system based on segment-level …