Audio self-supervised learning: A survey
Similar to humans' cognitive ability to generalize knowledge and skills, self-supervised
learning (SSL) targets discovering general representations from large-scale data. This …
Deep neural network techniques for monaural speech enhancement and separation: state of the art analysis
P Ochieng - Artificial Intelligence Review, 2023 - Springer
Deep neural networks (DNN) techniques have become pervasive in domains such as
natural language processing and computer vision. They have achieved great success in …
Mix and localize: Localizing sound sources in mixtures
We present a method for simultaneously localizing multiple sound sources within a visual
scene. This task requires a model to both group a sound mixture into individual sources, and …
What's all the fuss about free universal sound separation data?
We introduce the Free Universal Sound Separation (FUSS) dataset, a new corpus for
experiments in separating mixtures of an unknown number of sounds from an open domain …
RemixIT: Continual self-training of speech enhancement models via bootstrapped remixing
We present RemixIT, a simple yet effective self-supervised method for training speech
enhancement without the need of a single isolated in-domain speech nor a noise waveform …
ESPnet-SE: End-to-end speech enhancement and separation toolkit designed for ASR integration
We present ESPnet-SE, which is designed for the quick development of speech
enhancement and speech separation systems in a single framework, along with the optional …
AudioScopeV2: Audio-visual attention architectures for calibrated open-domain on-screen sound separation
We introduce AudioScopeV2, a state-of-the-art universal audio-visual on-screen sound
separation system which is capable of learning to separate sounds and associate them with …
Domain‐specific neural networks improve automated bird sound recognition already with small amount of local data
P Lauha, P Somervuo, P Lehikoinen… - Methods in Ecology …, 2022 - Wiley Online Library
An automatic bird sound recognition system is a useful tool for collecting data of different
bird species for ecological analysis. Together with autonomous recording units (ARUs), such …
Into the wild with AudioScope: Unsupervised audio-visual separation of on-screen sounds
Recent progress in deep learning has enabled many advances in sound separation and
visual scene understanding. However, extracting sound sources which are apparent in …
Improving bird classification with unsupervised sound separation
This paper addresses the problem of species classification in bird song recordings. The
massive amount of available field recordings of birds presents an opportunity to use …