DeepSinger: Singing voice synthesis with data mined from the web

Y Ren, X Tan, T Qin, J Luan, Z Zhao… - Proceedings of the 26th …, 2020 - dl.acm.org
In this paper, we develop DeepSinger, a multi-lingual multi-singer singing voice synthesis
(SVS) system, which is built from scratch using singing training data mined from music …

End-to-end lyrics alignment for polyphonic music using an audio-to-character recognition model

D Stoller, S Durand, S Ewert - ICASSP 2019-2019 IEEE …, 2019 - ieeexplore.ieee.org
Time-aligned lyrics can enrich the music listening experience by enabling karaoke, text-based
song retrieval and intra-song navigation, and other applications. Compared to text-to …

Automatic lyrics transcription of polyphonic music with lyrics-chord multi-task learning

X Gao, C Gupta, H Li - IEEE/ACM Transactions on Audio …, 2022 - ieeexplore.ieee.org
Lyrics are the words that make up a song, while chords are harmonic sets of multiple notes
in music. Lyrics and chords are generally essential information in music, i.e. unaccompanied …

NHSS: A speech and singing parallel database

B Sharma, X Gao, K Vijayan, X Tian, H Li - Speech Communication, 2021 - Elsevier
We present a database of parallel recordings of speech and singing, collected and released
by the Human Language Technology (HLT) laboratory at the National University of …

Automatic lyrics transcription using dilated convolutional neural networks with self-attention

E Demirel, S Ahlbäck, S Dixon - 2020 International Joint …, 2020 - ieeexplore.ieee.org
Speech recognition is a well developed research field so that the current state of the art
systems are being used in many applications in the software industry, yet as by today, there …

Genre-conditioned acoustic models for automatic lyrics transcription of polyphonic music

X Gao, C Gupta, H Li - ICASSP 2022-2022 IEEE International …, 2022 - ieeexplore.ieee.org
Lyrics transcription of polyphonic music is challenging not only because the singing vocals
are corrupted by the background music, but also because the background music and the …

Deep learning approaches in topics of singing information processing

C Gupta, H Li, M Goto - IEEE/ACM Transactions on Audio …, 2022 - ieeexplore.ieee.org
Singing, the vocal production of musical tones, is one of the most important elements of
music. Addressing the needs of real-world applications, the study of technologies related to …

MSTRE-Net: Multistreaming acoustic modeling for automatic lyrics transcription

E Demirel, S Ahlbäck, S Dixon - arXiv preprint arXiv:2108.02625, 2021 - arxiv.org
This paper makes several contributions to automatic lyrics transcription (ALT) research. Our
main contribution is a novel variant of the Multistreaming Time-Delay Neural Network …

Susing: Su-net for singing voice synthesis

X Zhang, J Wang, N Cheng… - 2022 International Joint …, 2022 - ieeexplore.ieee.org
Singing voice synthesis is a generative task that involves multi-dimensional control of the
singing model, including lyrics, pitch, and duration, and includes the timbre of the singer and …

Automatic lyrics-to-audio alignment on polyphonic music using singing-adapted acoustic models

B Sharma, C Gupta, H Li, Y Wang - ICASSP 2019-2019 IEEE …, 2019 - ieeexplore.ieee.org
Lyrics-to-audio alignment is to automatically align the lyrical words with the mixed singing
audio (singing voice + musical accompaniment). Such alignment can be achieved with an …