A comprehensive survey on deep music generation: Multi-level representations, algorithms, evaluations, and future directions
S Ji, J Luo, X Yang - arXiv preprint arXiv:2011.06801, 2020 - arxiv.org
The utilization of deep learning techniques in generating various contents (such as image,
text, etc.) has become a trend. Especially music, the topic of this paper, has attracted …
Foundation models for music: A survey
In recent years, foundation models (FMs) such as large language models (LLMs) and latent
diffusion models (LDMs) have profoundly impacted diverse sectors, including music. This …
MusicBERT: Symbolic music understanding with large-scale pre-training
Symbolic music understanding, which refers to the understanding of music from the symbolic
data (e.g., MIDI format, but not audio), covers many music applications such as genre …
MT3: Multi-task multitrack music transcription
Automatic Music Transcription (AMT), inferring musical notes from raw audio, is a
challenging task at the core of music understanding. Unlike Automatic Speech Recognition …
EMOPIA: A multi-modal pop piano dataset for emotion recognition and emotion-based music generation
While there are many music datasets with emotion labels in the literature, they cannot be
used for research on symbolic-domain music analysis or generation, as there are usually …
Sequence-to-sequence piano transcription with transformers
Automatic Music Transcription has seen significant progress in recent years by training
custom deep neural networks on large datasets. However, these models have required …
GiantMIDI-Piano: A large-scale MIDI dataset for classical piano music
Symbolic music datasets are important for music information retrieval and musical analysis.
However, there is a lack of large-scale symbolic datasets for classical piano music. In this …
MidiBERT-Piano: Large-scale pre-training for symbolic music understanding
This paper presents an attempt to employ the mask language modeling approach of BERT
to pre-train a 12-layer Transformer model over 4,166 pieces of polyphonic piano MIDI files …
Automatic piano transcription with hierarchical frequency-time transformer
Taking long-term spectral and temporal dependencies into account is essential for automatic
piano transcription. This is especially helpful when determining the precise onset and offset …
CLaMP: Contrastive language-music pre-training for cross-modal symbolic music information retrieval
We introduce CLaMP: Contrastive Language-Music Pre-training, which learns cross-modal
representations between natural language and symbolic music using a music encoder and …