FMSG-JLESS Submission for DCASE 2024 Task4 on Sound Event Detection with Heterogeneous Training Dataset and Potentially Missing Labels

Y Xiao, H Yin, J Bai, RK Das - arXiv preprint arXiv:2407.00291, 2024 - arxiv.org
This report presents the systems developed and submitted by Fortemedia Singapore
(FMSG) and Joint Laboratory of Environmental Sound Sensing (JLESS) for DCASE 2024 …

Improving audio spectrogram transformers for sound event detection through multi-stage training

F Schmid, P Primus, T Morocutti, J Greif… - arXiv preprint arXiv …, 2024 - arxiv.org
This technical report describes the CP-JKU team's submission for Task 4 Sound Event
Detection with Heterogeneous Training Datasets and Potentially Missing Labels of the …

Mixstyle based Domain Generalization for Sound Event Detection with Heterogeneous Training Data

Y Xiao, H Yin, J Bai, RK Das - arXiv preprint arXiv:2407.03654, 2024 - arxiv.org
This work explores domain generalization (DG) for sound event detection (SED), advancing
adaptability towards real-world scenarios. Our approach employs a mean-teacher …

Multi-Iteration Multi-Stage Fine-Tuning of Transformers for Sound Event Detection with Heterogeneous Datasets

F Schmid, P Primus, T Morocutti, J Greif… - arXiv preprint arXiv …, 2024 - arxiv.org
A central problem in building effective sound event detection systems is the lack of
high-quality, strongly annotated sound event datasets. For this reason, Task 4 of the DCASE 2024 …

Prototype based Masked Audio Model for Self-Supervised Learning of Sound Event Detection

P Cai, Y Song, N Jiang, Q Gu, I McLoughlin - arXiv preprint arXiv …, 2024 - arxiv.org
A significant challenge in sound event detection (SED) is the effective utilization of
unlabeled data, given the limited availability of labeled data due to high annotation costs …

Self Training and Ensembling Frequency Dependent Networks with Coarse Prediction Pooling and Sound Event Bounding Boxes

H Nam, D Min, S Choi, I Choi, YH Park - arXiv preprint arXiv:2406.15725, 2024 - arxiv.org
To tackle sound event detection (SED), we propose frequency dependent networks
(FreDNets), which heavily leverage frequency-dependent methods. We apply frequency …

Pushing the Limit of Sound Event Detection with Multi-Dilated Frequency Dynamic Convolution

H Nam, YH Park - arXiv preprint arXiv:2406.13312, 2024 - arxiv.org
Frequency dynamic convolution (FDY conv) has been a milestone in the sound event
detection (SED) field, but it involves a substantial increase in model size due to multiple …

TECHNICAL REPORT ON LEE SUBMISSION: SOUND EVENT DETECTION USING CONFORMER AND ATST FRAMEWORK FOR DCASE CHALLENGE 2024 …

Y Lee, JH Jung - dcase.community
Sound Event Detection (SED) has shown promising performance in detecting
and classifying meaningful events on the given audio signal input. Since the real-world …