Annotator Subjectivity in the MusicCaps Dataset

M Lee, S Doh, D Jeong - HCMIR@ ISMIR, 2023 - ceur-ws.org

Abstract

Musical captions, when expressed as free-form text rather than as more structured and limited musical tags, often reflect the individual characteristics of the annotator, thereby injecting a degree of subjectivity into the resulting dataset. This study explores the impact of such annotator subjectivity within the MusicCaps dataset, a pioneering collection of human-annotated captions describing 10-second music audio clips. We conducted three distinct analyses to investigate the presence of this subjectivity: an examination of the frequency distribution of tag categories (i.e., genre, mood, or instruments) across annotators; a qualitative assessment of caption embeddings through UMAP visualizations; and a quantitative analysis in which we train and compare cross-modal retrieval models using an annotator-specified training split. Our findings underscore the significant annotator subjectivity inherent in the MusicCaps dataset, emphasizing the need to account for it when collecting free-form text annotations on music or developing machine-learning models with this type of dataset.
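To make the first analysis concrete, the following is a minimal sketch of how a per-annotator frequency distribution over tag categories could be computed. It is not the authors' code: the `TAG_CATEGORY` map, the toy `rows`, and the function name `category_distribution` are all hypothetical, and the paper's actual tag-to-category assignment is not described in this abstract.

```python
from collections import Counter, defaultdict

# Hypothetical tag-to-category map (an assumption for illustration only).
TAG_CATEGORY = {
    "rock": "genre", "jazz": "genre",
    "happy": "mood", "sad": "mood",
    "guitar": "instrument", "piano": "instrument",
}


def category_distribution(rows):
    """Return {annotator_id: {category: relative frequency}}.

    `rows` is an iterable of (annotator_id, list_of_tags) pairs.
    Tags missing from TAG_CATEGORY are ignored.
    """
    counts = defaultdict(Counter)
    for annotator_id, tags in rows:
        for tag in tags:
            category = TAG_CATEGORY.get(tag)
            if category is not None:
                counts[annotator_id][category] += 1

    distribution = {}
    for annotator_id, counter in counts.items():
        total = sum(counter.values())
        # Normalize so that annotators with different caption counts are comparable.
        distribution[annotator_id] = {c: n / total for c, n in counter.items()}
    return distribution


# Toy data standing in for (annotator, tag list) pairs; not taken from MusicCaps.
rows = [
    (0, ["rock", "guitar", "happy"]),
    (0, ["jazz", "piano"]),
    (1, ["sad", "happy", "piano"]),
]
dist = category_distribution(rows)
print(dist)
```

In this toy input, annotator 0 spreads tags across genre, instrument, and mood, while annotator 1 never uses a genre tag. A difference of that kind between annotators is what a comparison of such distributions would surface.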

Cited by: 3