A comprehensive review of polyphonic sound event detection
One of the most amazing functions of the human auditory system is the ability to detect all
kinds of sound events in the environment. With the technologies and hardware advances …
Unsupervised sound separation using mixture invariant training
In recent years, rapid progress has been made on the problem of single-channel sound
separation using supervised training of deep neural networks. In such supervised …
Sudo rm -rf: Efficient networks for universal audio source separation
In this paper, we present an efficient neural network for end-to-end general purpose audio
source separation. Specifically, the backbone structure of this convolutional network is the …
Far-field automatic speech recognition
The machine recognition of speech spoken at a distance from the microphones, known as
far-field automatic speech recognition (ASR), has received a significant increase in attention …
What's all the fuss about free universal sound separation data?
We introduce the Free Universal Sound Separation (FUSS) dataset, a new corpus for
experiments in separating mixtures of an unknown number of sounds from an open domain …
AudioScopeV2: Audio-visual attention architectures for calibrated open-domain on-screen sound separation
We introduce AudioScopeV2, a state-of-the-art universal audio-visual on-screen sound
separation system which is capable of learning to separate sounds and associate them with …
Weakly-supervised audio-visual segmentation
S Mo, B Raj - Advances in Neural Information Processing …, 2024 - proceedings.neurips.cc
Audio-visual segmentation is a challenging task that aims to predict pixel-level masks for
sound sources in a video. Previous work applied a comprehensive manually designed …
Into the wild with AudioScope: Unsupervised audio-visual separation of on-screen sounds
Recent progress in deep learning has enabled many advances in sound separation and
visual scene understanding. However, extracting sound sources which are apparent in …
Separate what you describe: Language-queried audio source separation
In this paper, we introduce the task of language-queried audio source separation (LASS),
which aims to separate a target source from an audio mixture based on a natural language …
Move2Hear: Active audio-visual source separation
S Majumder, Z Al-Halah… - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com
We introduce the active audio-visual source separation problem, where an agent must move
intelligently in order to better isolate the sounds coming from an object of interest in its …