Birds, bats and beyond: Evaluating generalization in bioacoustics models

B van Merriënboer, J Hamer, V Dumoulin… - Frontiers in Bird …, 2024 - frontiersin.org
In the context of passive acoustic monitoring (PAM) better models are needed to reliably
gain insights from large amounts of raw, unlabeled data. Bioacoustics foundation models …

Semi-supervised Learning-based Sound Event Detection using Frequency Dynamic Convolution with Large Kernel Attention for DCASE Challenge 2023 Task 4

JW Kim, SW Son, Y Song, HK Kim, IH Song… - arXiv preprint arXiv …, 2023 - arxiv.org
This report proposes a frequency dynamic convolution (FDY) with a large kernel attention
(LKA)-convolutional recurrent neural network (CRNN) with a pre-trained bidirectional …

Why do angular margin losses work well for semi-supervised anomalous sound detection?

K Wilkinghoff, F Kurth - IEEE/ACM Transactions on Audio …, 2023 - ieeexplore.ieee.org
State-of-the-art anomalous sound detection systems often utilize angular margin losses to
learn suitable representations of acoustic data using an auxiliary task, which usually is a …

FMSG submission for DCASE 2023 challenge task 4 on sound event detection with weak labels and synthetic soundscapes

Y Xiao, T Khandelwal, RK Das - Proc. DCASE Challenge, 2023 - dcase.community
This report presents the systems developed and submitted by Fortemedia Singapore
(FMSG) for DCASE 2023 Task 4A, which focuses on sound event detection with weak labels …

Li USTC team's submission for DCASE 2023 challenge task4a

K Li, P Cai, Y Song - Tech. Rep., DCASE2023 Challenge, 2023 - dcase.community
In this technical report, we present our submissions for DCASE 2023 challenge task4a. We
mainly study how to fine-tune patchout fast spectrogram transformer (PaSST) for sound …

Fine-tune the pretrained ATST model for sound event detection

N Shao, X Li, X Li - ICASSP 2024-2024 IEEE International …, 2024 - ieeexplore.ieee.org
Sound event detection (SED) often suffers from the data deficiency problem. Recent SED
systems leverage the large pretrained self-supervised learning (SelfSL) models to mitigate …

Post-processing independent evaluation of sound event detection systems

J Ebbers, R Haeb-Umbach, R Serizel - arXiv preprint arXiv:2306.15440, 2023 - arxiv.org
Due to the high variation in the application requirements of sound event detection (SED)
systems, it is not sufficient to evaluate systems only in a single operating mode. Therefore …

Sound Event Bounding Boxes

J Ebbers, FG Germain, G Wichern, JL Roux - arXiv preprint arXiv …, 2024 - arxiv.org
Sound event detection is the task of recognizing sounds and determining their extent
(onset/offset times) within an audio clip. Existing systems commonly predict sound presence …

Towards Weakly Supervised Text-to-Audio Grounding

X Xu, Z Ma, M Wu, K Yu - arXiv preprint arXiv:2401.02584, 2024 - arxiv.org
Text-to-audio grounding (TAG) task aims to predict the onsets and offsets of sound events
described by natural language. This task can facilitate applications such as multimodal …

Sound Event Detection: A Journey Through DCASE Challenge Series

T Khandelwal, RK Das, ES Chng - APSIPA Transactions on …, 2024 - nowpublishers.com
The sense of hearing is fundamental to human beings, as it allows them to perceive their
surroundings. However, this simple task of recognizing different sounds in complex …