Song Describer: a platform for collecting textual descriptions of music recordings

I Manco, B Weck, S Doh, M Won, Y Zhang… - arXiv preprint arXiv …, 2023 - arxiv.org

We introduce the Song Describer dataset (SDD), a new crowdsourced corpus of high-quality
audio-caption pairs, designed for the evaluation of music-and-language models. The …

Cited by 22 · Related articles · All 5 versions

LP-MusicCaps: LLM-based pseudo music captioning

SH Doh, K Choi, J Lee, J Nam - arXiv preprint arXiv:2307.16372, 2023 - arxiv.org

Automatic music captioning, which generates natural language descriptions for given music
tracks, holds significant potential for enhancing the understanding and organization of large …

Cited by 56 · Related articles · All 6 versions

MuseChat: A conversational music recommendation system for videos

Z Dong, X Liu, B Chen, P Polak… - Proceedings of the …, 2024 - openaccess.thecvf.com

Music recommendation for videos attracts growing interest in multi-modal research.
However, existing systems focus primarily on content compatibility, often ignoring the users' …

Cited by 18 · Related articles · All 4 versions

WikiMuTe: A web-sourced dataset of semantic descriptions for music audio

B Weck, H Kirchhoff, P Grosche, X Serra - International Conference on …, 2024 - Springer

Multi-modal deep learning techniques for matching free-form text with music have shown
promising results in the field of Music Information Retrieval (MIR). Prior work is often based …

Cited by 2 · Related articles · All 4 versions

Annotator Subjectivity in the MusicCaps Dataset

M Lee, S Doh, D Jeong - HCMIR@ISMIR, 2023 - ceur-ws.org

Musical caption, when expressed in free-form text as opposed to more structured and limited
musical tags, often encompasses the individual characteristics of the annotator, thereby …

Cited by 3 · Related articles

Zero-Shot Structure Labeling with Audio and Language Model Embeddings

M Buisson, C Ick, T Xi, B McFee - … Abstracts for the Late-Breaking Demo …, 2024 - hal.science

Recent progress on audio-based music structure analysis has closely aligned with the
appearance of new deep learning paradigms, notably for the extraction of robust spectro …