Toward universal text-to-music retrieval

SH Doh, M Won, K Choi, J Nam - ICASSP 2023-2023 IEEE …, 2023 - ieeexplore.ieee.org
This paper introduces effective design choices for text-to-music retrieval systems. An ideal
text-based retrieval system would support various input queries such as pre-defined tags …

Music Discovery Dialogue Generation Using Human Intent Analysis and Large Language Models

SH Doh, K Choi, D Kwon, T Kim, J Nam - arXiv preprint arXiv:2411.07439, 2024 - arxiv.org
A conversational music retrieval system can help users discover music that matches their
preferences through dialogue. To achieve this, a conversational music retrieval system …

Pitch-timbre disentanglement of musical instrument sounds based on VAE-based metric learning

K Tanaka, R Nishikimi, Y Bando… - ICASSP 2021-2021 …, 2021 - ieeexplore.ieee.org
This paper describes a representation learning method for disentangling an arbitrary
musical instrument sound into latent pitch and timbre representations. Although such pitch …

Multi-modal, multi-task and multi-label for music genre classification and emotion regression

YR Pandeya, J You, B Bhattarai… - … on Information and …, 2021 - ieeexplore.ieee.org
A smart system is highly desirable with the capability to divide music into coarse and fine
categories based on emotion and genre. In this paper, we classify the music based on genre …

Multi-modal music information retrieval: Augmenting audio-analysis with visual computing for improved music video analysis

A Schindler - arXiv preprint arXiv:2002.00251, 2020 - arxiv.org
This thesis combines audio-analysis with computer vision to approach Music Information
Retrieval (MIR) tasks from a multi-modal perspective. This thesis focuses on the information …

Musical Word Embedding for Music Tagging and Retrieval

SH Doh, J Lee, D Jeong, J Nam - arXiv preprint arXiv:2404.13569, 2024 - arxiv.org
Word embedding has become an essential means for text-based information retrieval.
Typically, word embeddings are learned from large quantities of general and unstructured …

Musical word embedding: Bridging the gap between listening contexts and music

S Doh, J Lee, TH Park, J Nam - arXiv preprint arXiv:2008.01190, 2020 - arxiv.org
Word embedding pioneered by Mikolov et al. is a staple technique for word representations
in natural language processing (NLP) research which has also found popularity in music …

A Long-Tail Friendly Representation Framework for Artist and Music Similarity

H Xiang, J Dai, X Song, F Shen - arXiv preprint arXiv:2309.04182, 2023 - arxiv.org
The investigation of the similarity between artists and music is crucial in music retrieval and
recommendation, and addressing the challenge of the long-tail phenomenon is increasingly …

Multi-modal video forensic platform for investigating post-terrorist attack scenarios

A Schindler, A Lindley, A Jalali, M Boyer… - Proceedings of the 11th …, 2020 - dl.acm.org
The forensic investigation of a terrorist attack poses a significant challenge to the
investigative authorities, as often several thousand hours of video footage must be viewed …

A Graph-Based Relook Beyond Metadata for Music Recommendation

V Bharadwaj, AS Mysore, N Sangli… - … and Computer Vision …, 2023 - books.google.com
Hyper personalization is permeating several domains to increase customer satisfaction and
achieve a more intimate user experience. However, popular music recommendation …