Separate anything you describe

X Liu, Q Kong, Y Zhao, H Liu, Y Yuan… - … on Audio, Speech …, 2024 - ieeexplore.ieee.org
Language-queried audio source separation (LASS) is a new paradigm for computational
auditory scene analysis (CASA). LASS aims to separate a target sound from an audio …

VoiceFixer: A unified framework for high-fidelity speech restoration

H Liu, X Liu, Q Kong, Q Tian, Y Zhao, DL Wang… - arXiv preprint arXiv …, 2022 - arxiv.org
Speech restoration aims to remove distortions in speech signals. Prior methods mainly focus
on a single type of distortion, such as speech denoising or dereverberation. However …

BirdSoundsDenoising: Deep visual audio denoising for bird sounds

Y Zhang, J Li - Proceedings of the IEEE/CVF Winter …, 2023 - openaccess.thecvf.com
Audio denoising has been explored for decades using both traditional and deep learning-based
methods. However, these methods are still limited to either manually added artificial …

Zero-shot audio source separation through query-based learning from weakly-labeled data

K Chen, X Du, B Zhu, Z Ma, T Berg-Kirkpatrick… - Proceedings of the …, 2022 - ojs.aaai.org
Deep learning techniques for separating audio into different sound sources face several
challenges. Standard architectures require training separate models for different types of …

DeepLabV3+ vision transformer for visual bird sound denoising

J Li, P Wang, Y Zhang - IEEE Access, 2023 - ieeexplore.ieee.org
Audio denoising is a task to improve the perceptual quality of noisy audio signals. There is
still residual noise after the denoising of noisy signals, which will affect the quality of audio …

Separate but together: Unsupervised federated learning for speech enhancement from non-IID data

E Tzinis, J Casebeer, Z Wang… - 2021 IEEE Workshop …, 2021 - ieeexplore.ieee.org
We propose FedEnhance, an unsupervised federated learning (FL) approach for speech
enhancement and separation with non-IID distributed data across multiple clients. We …

Complex image generation SwinTransformer network for audio denoising

Y Zhang, J Li - arXiv preprint arXiv:2310.16109, 2023 - arxiv.org
Achieving high-performance audio denoising is still a challenging task in real-world
applications. Existing time-frequency methods often ignore the quality of generated …

Surrey system for DCASE 2022 Task 5: Few-shot bioacoustic event detection with segment-level metric learning

H Liu, X Liu, X Mei, Q Kong, W Wang… - arXiv preprint arXiv …, 2022 - arxiv.org
Few-shot audio event detection is a task that detects the occurrence time of a novel sound
class given a few examples. In this work, we propose a system based on segment-level …

Background-aware Modeling for Weakly Supervised Sound Event Detection

Y Xin, D Yang, Y Zou - Proc. INTERSPEECH, 2023 - isca-archive.org
Nowadays, a common framework for weakly supervised sound event detection (WSSED) is
multiple instance learning (MIL). However, MIL directly optimizes the clip-level classification …

Segment-level metric learning for few-shot bioacoustic event detection

H Liu, X Liu, X Mei, Q Kong, W Wang… - arXiv preprint arXiv …, 2022 - arxiv.org
Few-shot bioacoustic event detection is a task that detects the occurrence time of a novel
sound given a few examples. Previous methods employ metric learning to build a latent …