Adapting Frechet audio distance for generative music evaluation

A Gui, H Gamper, S Braun… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org
The growing popularity of generative music models underlines the need for perceptually
relevant, objective music quality metrics. The Frechet Audio Distance (FAD) is commonly …

The Sound Demixing Challenge 2023 – Music Demixing Track

G Fabbro, S Uhlich, CH Lai, W Choi… - arXiv preprint arXiv …, 2023 - arxiv.org
This paper summarizes the music demixing (MDX) track of the Sound Demixing Challenge
(SDX'23). We provide a summary of the challenge setup and introduce the task of robust …

AudioSR: Versatile audio super-resolution at scale

H Liu, K Chen, Q Tian, W Wang… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org
Audio super-resolution is a fundamental task that predicts high-frequency components for
low-resolution audio, enhancing audio quality in digital applications. Previous methods have …

COCOLA: Coherence-Oriented Contrastive Learning of Musical Audio Representations

R Ciranni, E Postolache, G Mariani, M Mancusi… - arXiv preprint arXiv …, 2024 - arxiv.org
We present COCOLA (Coherence-Oriented Contrastive Learning for Audio), a contrastive
learning method for musical audio representations that captures the harmonic and rhythmic …

SCNet: Sparse Compression Network for Music Source Separation

W Tong, J Zhu, J Chen, S Kang, T Jiang… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org
Deep learning-based methods have made significant achievements in music source
separation. However, obtaining good results while maintaining a low model complexity …

The Cadenza ICASSP 2024 Grand Challenge

GR Dabike, MA Akeroyd, S Bannister, J Barker… - arXiv preprint arXiv …, 2023 - arxiv.org
The Cadenza project aims to enhance the audio quality of music for individuals with hearing
loss. As part of this, the project is organizing the ICASSP SP Cadenza Challenge: Music …

SMITIN: Self-Monitored Inference-Time INtervention for Generative Music Transformers

J Koo, G Wichern, FG Germain, S Khurana… - arXiv preprint arXiv …, 2024 - arxiv.org
We introduce Self-Monitored Inference-Time INtervention (SMITIN), an approach for
controlling an autoregressive generative music transformer using classifier probes. These …

Ambisonizer: Neural Upmixing as Spherical Harmonics Generation

Y Zang, Y Wang, M Lee - arXiv preprint arXiv:2405.13428, 2024 - arxiv.org
Neural upmixing, the task of generating immersive music with an increased number of
channels from fewer input channels, has been an active research area, with mono-to-stereo …

A Stem-Agnostic Single-Decoder System for Music Source Separation Beyond Four Stems

KN Watcharasupat, A Lerch - arXiv preprint arXiv:2406.18747, 2024 - arxiv.org
Despite significant recent progress across multiple subtasks of audio source separation, few
music source separation systems support separation beyond the four-stem vocals, drums …

Why does music source separation benefit from cacophony?

CB Jeon, G Wichern, FG Germain, JL Roux - arXiv preprint arXiv …, 2024 - arxiv.org
In music source separation, a standard training data augmentation procedure is to create
new training samples by randomly combining instrument stems from different songs. These …