A comprehensive survey on deep music generation: Multi-level representations, algorithms, evaluations, and future directions

S Ji, J Luo, X Yang - arXiv preprint arXiv:2011.06801, 2020 - arxiv.org
The utilization of deep learning techniques in generating various contents (such as image,
text, etc.) has become a trend. Especially music, the topic of this paper, has attracted …

A novel estimator of mutual information for learning to disentangle textual representations

P Colombo, C Clavel, P Piantanida - arXiv preprint arXiv:2105.02685, 2021 - arxiv.org
Learning disentangled representations of textual data is essential for many natural language
tasks such as fair classification, style transfer and sentence generation, among others. The …

CDPAM: Contrastive learning for perceptual audio similarity

P Manocha, Z Jin, R Zhang… - ICASSP 2021-2021 …, 2021 - ieeexplore.ieee.org
Many speech processing methods based on deep learning require an automatic and
differentiable audio metric for the loss function. The DPAM approach of Manocha et al. [1] …

Music FaderNets: Controllable music generation based on high-level features via low-level feature modelling

HH Tan, D Herremans - arXiv preprint arXiv:2007.15474, 2020 - arxiv.org
High-level musical qualities (such as emotion) are often abstract, subjective, and hard to
quantify. Given these difficulties, it is not easy to learn good feature representations with …

Learning disentangled representations of timbre and pitch for musical instrument sounds using Gaussian mixture variational autoencoders

YJ Luo, K Agres, D Herremans - arXiv preprint arXiv:1906.08152, 2019 - arxiv.org
In this paper, we learn disentangled representations of timbre and pitch for musical
instrument sounds. We adapt a framework based on variational autoencoders with Gaussian …

Musical composition style transfer via disentangled timbre representations

YN Hung, I Chiang, YA Chen, YH Yang - arXiv preprint arXiv …, 2019 - arxiv.org
Music creation involves not only composing the different parts (e.g., melody, chords) of a
musical work but also arranging/selecting the instruments to play the different parts. While …

Deep generative models for musical audio synthesis

M Huzaifah, L Wyse - Handbook of artificial intelligence for music …, 2021 - Springer

MG-VAE: deep Chinese folk songs generation with specific regional styles

J Luo, X Yang, S Ji, J Li - Proceedings of the 7th Conference on Sound …, 2020 - Springer
Regional style in Chinese folk songs is a rich treasure that can be used for ethnic music
creation and folk culture research. In this paper, we propose MG-VAE, a music generative …

Unsupervised Disentanglement of Pitch and Timbre for Isolated Musical Instrument Sounds

YJ Luo, KW Cheuk, T Nakano, M Goto, D Herremans - ISMIR, 2020 - academia.edu
Disentangling factors of variation aims to uncover latent variables that underlie the process
of data generation. In this paper, we propose a framework that achieves unsupervised pitch …

Anti-transfer learning for task invariance in convolutional neural networks for speech processing

E Guizzo, T Weyde, G Tarroni - Neural Networks, 2021 - Elsevier
We introduce the novel concept of anti-transfer learning for speech processing with
convolutional neural networks. While transfer learning assumes that the learning process for …