Adapting Frechet audio distance for generative music evaluation
The growing popularity of generative music models underlines the need for perceptually
relevant, objective music quality metrics. The Frechet Audio Distance (FAD) is commonly …
The Sound Demixing Challenge 2023 – Music Demixing Track
This paper summarizes the music demixing (MDX) track of the Sound Demixing Challenge
(SDX'23). We provide a summary of the challenge setup and introduce the task of robust …
AudioSR: Versatile audio super-resolution at scale
Audio super-resolution is a fundamental task that predicts high-frequency components for
low-resolution audio, enhancing audio quality in digital applications. Previous methods have …
COCOLA: Coherence-Oriented Contrastive Learning of Musical Audio Representations
We present COCOLA (Coherence-Oriented Contrastive Learning for Audio), a contrastive
learning method for musical audio representations that captures the harmonic and rhythmic …
SCNet: Sparse Compression Network for Music Source Separation
Deep learning-based methods have made significant achievements in music source
separation. However, obtaining good results while maintaining a low model complexity …
The Cadenza ICASSP 2024 Grand Challenge
The Cadenza project aims to enhance the audio quality of music for individuals with hearing
loss. As part of this, the project is organizing the ICASSP SP Cadenza Challenge: Music …
SMITIN: Self-Monitored Inference-Time INtervention for Generative Music Transformers
We introduce Self-Monitored Inference-Time INtervention (SMITIN), an approach for
controlling an autoregressive generative music transformer using classifier probes. These …
Ambisonizer: Neural Upmixing as Spherical Harmonics Generation
Y Zang, Y Wang, M Lee - arXiv preprint arXiv:2405.13428, 2024 - arxiv.org
Neural upmixing, the task of generating immersive music with an increased number of
channels from fewer input channels, has been an active research area, with mono-to-stereo …
A Stem-Agnostic Single-Decoder System for Music Source Separation Beyond Four Stems
KN Watcharasupat, A Lerch - arXiv preprint arXiv:2406.18747, 2024 - arxiv.org
Despite significant recent progress across multiple subtasks of audio source separation, few
music source separation systems support separation beyond the four-stem vocals, drums …
Why does music source separation benefit from cacophony?
In music source separation, a standard training data augmentation procedure is to create
new training samples by randomly combining instrument stems from different songs. These …