Birds, bats and beyond: Evaluating generalization in bioacoustics models
In the context of passive acoustic monitoring (PAM), better models are needed to reliably
gain insights from large amounts of raw, unlabeled data. Bioacoustics foundation models …
Semi-supervised Learning-based Sound Event Detection using Frequency Dynamic Convolution with Large Kernel Attention for DCASE Challenge 2023 Task 4
This report proposes a frequency dynamic convolution (FDY) with a large kernel attention
(LKA)-convolutional recurrent neural network (CRNN) with a pre-trained bidirectional …
Why do angular margin losses work well for semi-supervised anomalous sound detection?
K Wilkinghoff, F Kurth - IEEE/ACM Transactions on Audio …, 2023 - ieeexplore.ieee.org
State-of-the-art anomalous sound detection systems often utilize angular margin losses to
learn suitable representations of acoustic data using an auxiliary task, which usually is a …
FMSG submission for DCASE 2023 challenge task 4 on sound event detection with weak labels and synthetic soundscapes
This report presents the systems developed and submitted by Fortemedia Singapore
(FMSG) for DCASE 2023 Task 4A, which focuses on sound event detection with weak labels …
Li USTC team's submission for DCASE 2023 challenge task4a
K Li, P Cai, Y Song - Tech. Rep., DCASE2023 Challenge, 2023 - dcase.community
In this technical report, we present our submissions for DCASE 2023 challenge task4a. We
mainly study how to fine-tune patchout fast spectrogram transformer (PaSST) for sound …
Fine-tune the pretrained ATST model for sound event detection
Sound event detection (SED) often suffers from the data deficiency problem. Recent SED
systems leverage the large pretrained self-supervised learning (SelfSL) models to mitigate …
Post-processing independent evaluation of sound event detection systems
Due to the high variation in the application requirements of sound event detection (SED)
systems, it is not sufficient to evaluate systems only in a single operating mode. Therefore …
Sound Event Bounding Boxes
Sound event detection is the task of recognizing sounds and determining their extent
(onset/offset times) within an audio clip. Existing systems commonly predict sound presence …
Towards Weakly Supervised Text-to-Audio Grounding
The text-to-audio grounding (TAG) task aims to predict the onsets and offsets of sound events
described by natural language. This task can facilitate applications such as multimodal …
Sound Event Detection: A Journey Through DCASE Challenge Series
The sense of hearing is fundamental to human beings, as it allows them to perceive their
surroundings. However, this simple task of recognizing different sounds in complex …