Separate anything you describe
Language-queried audio source separation (LASS) is a new paradigm for computational
auditory scene analysis (CASA). LASS aims to separate a target sound from an audio …
VoiceFixer: A unified framework for high-fidelity speech restoration
Speech restoration aims to remove distortions in speech signals. Prior methods mainly focus
on a single type of distortion, such as speech denoising or dereverberation. However …
BirdSoundsDenoising: Deep visual audio denoising for bird sounds
Audio denoising has been explored for decades using both traditional and deep learning-based
methods. However, these methods are still limited to either manually added artificial …
Zero-shot audio source separation through query-based learning from weakly-labeled data
Deep learning techniques for separating audio into different sound sources face several
challenges. Standard architectures require training separate models for different types of …
DeepLabV3+ vision transformer for visual bird sound denoising
J Li, P Wang, Y Zhang - IEEE Access, 2023 - ieeexplore.ieee.org
Audio denoising is a task to improve the perceptual quality of noisy audio signals. There is
still residual noise after the denoising of noisy signals, which will affect the quality of audio …
Separate but together: Unsupervised federated learning for speech enhancement from non-IID data
We propose FedEnhance, an unsupervised federated learning (FL) approach for speech
enhancement and separation with non-IID distributed data across multiple clients. We …
Complex image generation SwinTransformer network for audio denoising
Achieving high-performance audio denoising is still a challenging task in real-world
applications. Existing time-frequency methods often ignore the quality of generated …
Surrey system for DCASE 2022 Task 5: Few-shot bioacoustic event detection with segment-level metric learning
Few-shot audio event detection is a task that detects the occurrence time of a novel sound
class given a few examples. In this work, we propose a system based on segment-level …
Background-aware Modeling for Weakly Supervised Sound Event Detection
Nowadays, a common framework for weakly supervised sound event detection (WSSED) is
multiple instance learning (MIL). However, MIL directly optimizes the clip-level classification …
Segment-level metric learning for few-shot bioacoustic event detection
Few-shot bioacoustic event detection is a task that detects the occurrence time of a novel
sound given a few examples. Previous methods employ metric learning to build a latent …