DiffSinger: Singing voice synthesis via shallow diffusion mechanism

J Liu, C Li, Y Ren, F Chen, Z Zhao - … of the AAAI conference on artificial …, 2022 - ojs.aaai.org
Singing voice synthesis (SVS) systems are built to synthesize high-quality and expressive
singing voice, in which the acoustic model generates the acoustic features (e.g., mel …

M4Singer: A multi-style, multi-singer and musical score provided Mandarin singing corpus

L Zhang, R Li, S Wang, L Deng, J Liu… - Advances in …, 2022 - proceedings.neurips.cc
The lack of publicly available high-quality and accurately labeled datasets has long been a
major bottleneck for singing voice synthesis (SVS). To tackle this problem, we present …

Multi-singer: Fast multi-singer singing voice vocoder with a large-scale corpus

R Huang, F Chen, Y Ren, J Liu, C Cui… - Proceedings of the 29th …, 2021 - dl.acm.org
High-fidelity multi-singer singing voice synthesis is challenging for neural vocoder due to the
singing voice data shortage, limited singer generalization, and large computational cost …

PopMAG: Pop music accompaniment generation

Y Ren, J He, X Tan, T Qin, Z Zhao, TY Liu - Proceedings of the 28th ACM …, 2020 - dl.acm.org
In pop music, accompaniments are usually played by multiple instruments (tracks) such as
drum, bass, string and guitar, and can make a song more expressive and contagious by …

VISinger: Variational inference with adversarial learning for end-to-end singing voice synthesis

Y Zhang, J Cong, H Xue, L Xie… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org
In this paper, we propose VISinger, a complete end-to-end high-quality singing voice
synthesis (SVS) system that directly generates singing audio from lyrics and musical score …

HiFiSinger: Towards high-fidelity neural singing voice synthesis

J Chen, X Tan, J Luan, T Qin, TY Liu - arXiv preprint arXiv:2009.01776, 2020 - arxiv.org
High-fidelity singing voices usually require higher sampling rate (e.g., 48kHz) to convey
expression and emotion. However, higher sampling rate causes the wider frequency band …

A tutorial on AI music composition

X Tan, X Li - Proceedings of the 29th ACM international conference …, 2021 - dl.acm.org
AI music composition is one of the most attractive and important topics in artificial
intelligence, music, and multimedia. The typical tasks in AI music composition include …

Sinsy: A deep neural network-based singing voice synthesis system

Y Hono, K Hashimoto, K Oura… - … on Audio, Speech …, 2021 - ieeexplore.ieee.org
This paper presents Sinsy, a deep neural network (DNN)-based singing voice synthesis
(SVS) system. In recent years, DNNs have been utilized in statistical parametric SVS …

DDSP-based singing vocoders: A new subtractive-based synthesizer and a comprehensive evaluation

DY Wu, WY Hsiao, FR Yang, O Friedman… - arXiv preprint arXiv …, 2022 - arxiv.org
A vocoder is a conditional audio generation model that converts acoustic features such as
mel-spectrograms into waveforms. Taking inspiration from Differentiable Digital Signal …

MusicAgent: An AI agent for music understanding and generation with large language models

D Yu, K Song, P Lu, T He, X Tan, W Ye, S Zhang… - arXiv preprint arXiv …, 2023 - arxiv.org
AI-empowered music processing is a diverse field that encompasses dozens of tasks,
ranging from generation tasks (e.g., timbre synthesis) to comprehension tasks (e.g., music …