M4Singer: A multi-style, multi-singer and musical score provided Mandarin singing corpus
The lack of publicly available high-quality and accurately labeled datasets has long been a
major bottleneck for singing voice synthesis (SVS). To tackle this problem, we present …
GibbsDDRM: A partially collapsed Gibbs sampler for solving blind inverse problems with denoising diffusion restoration
Pre-trained diffusion models have been successfully used as priors in a variety of linear
inverse problems, where the goal is to reconstruct a signal from noisy linear measurements …
Opencpop: A high-quality open source Chinese popular song corpus for singing voice synthesis
This paper introduces Opencpop, a publicly available high-quality Mandarin singing corpus
designed for singing voice synthesis (SVS). The corpus consists of 100 popular Mandarin …
The singing voice conversion challenge 2023
We present the latest iteration of the voice conversion challenge (VCC) series, a bi-annual
scientific event aiming to compare and understand different voice conversion (VC) systems …
Globally, songs and instrumental melodies are slower and higher and use more stable pitches than speech: A Registered Report
Both music and language are found in all known human societies, yet no studies have
compared similarities and differences between song, speech, and instrumental music on a …
Unsupervised vocal dereverberation with diffusion-based generative models
Removing reverb from reverberant music is a necessary technique to clean up audio for
downstream music manipulations. Reverberation of music contains two categories, natural …
Singing voice data scaling-up: An introduction to ACE-Opencpop and KiSing-v2
In singing voice synthesis (SVS), generating singing voices from musical scores faces
challenges due to limited data availability, a constraint less common in text-to-speech (TTS) …
Hierarchical diffusion models for singing voice neural vocoder
Recent progress in deep generative models has improved the quality of neural vocoders in
speech domain. However, generating a high-quality singing voice remains challenging due …
Deep learning approaches in topics of singing information processing
Singing, the vocal production of musical tones, is one of the most important elements of
music. Addressing the needs of real-world applications, the study of technologies related to …
Globally, songs and instrumental melodies are slower, higher, and use more stable pitches than speech [Stage 2 Registered Report]
What, if any, similarities and differences between music and speech are consistent across
cultures? Both music and language are found in all known human societies and are argued …