Foundation models for music: A survey

Y Ma, A Øland, A Ragni, BMS Del Sette, C Saitis… - arXiv preprint arXiv …, 2024 - arxiv.org
In recent years, foundation models (FMs) such as large language models (LLMs) and latent
diffusion models (LDMs) have profoundly impacted diverse sectors, including music. This …

Retrieval guided music captioning via multimodal prefixes

N Srivatsan, K Chen, S Dubnov… - Thirty-Third International …, 2023 - hal.science
In this paper we put forward a new approach to music captioning, the task of automatically
generating natural language descriptions for songs. These descriptions are useful both for …

Leveraging Structure and Context for Language-Adjacent Representation Learning

N Srivatsan - 2024 - kilthub.cmu.edu
When learning representations from large corpora of language data, the overwhelming
strategy is to interpret that data as a collection of IID samples to be modeled in isolation from …