Singing voice data scaling-up: An introduction to ace-opencpop and kising-v2

Y Zhang, Y Zang, J Shi, R Yamamoto, T Toda… - arXiv preprint arXiv …, 2024 - arxiv.org

With the advancements in singing voice generation and the growing presence of AI singers
on media platforms, the inaugural Singing Voice Deepfake Detection (SVDD) Challenge …

Cited by 1 · Related articles · All 2 versions

Video and audio deepfake datasets and open issues in deepfake technology: being ahead of the curve

Z Akhtar, TL Pendyala, VS Athmakuri - Forensic Sciences, 2024 - mdpi.com

The revolutionary breakthroughs in Machine Learning (ML) and Artificial Intelligence (AI) are
extensively being harnessed across a diverse range of domains, e.g., forensic science …

Cited by 1 · Related articles

SingOMD: Singing Oriented Multi-resolution Discrete Representation Construction from Speech Models

Y Tang, Y Wu, J Shi, Q Jin - arXiv preprint arXiv:2406.08905, 2024 - arxiv.org

Discrete representation has shown advantages in speech generation tasks, wherein
discrete tokens are derived by discretizing hidden features from self-supervised learning …

Cited by 1 · Related articles · All 2 versions

CtrSVDD: A Benchmark Dataset and Baseline Analysis for Controlled Singing Voice Deepfake Detection

Y Zang, J Shi, Y Zhang, R Yamamoto, J Han… - arXiv preprint arXiv …, 2024 - arxiv.org

Recent singing voice synthesis and conversion advancements necessitate robust singing
voice deepfake detection (SVDD) models. Current SVDD datasets face challenges due to …

Cited by 5 · Related articles · All 2 versions

TokSing: Singing Voice Synthesis based on Discrete Tokens

Y Wu, J Shi, Y Tang, S Yang, Q Jin - arXiv preprint arXiv:2406.08416, 2024 - arxiv.org

Recent advancements in speech synthesis witness significant benefits by leveraging
discrete tokens extracted from self-supervised learning (SSL) models. Discrete tokens offer …

Cited by 2 · Related articles · All 2 versions

VISinger2+: End-to-End Singing Voice Synthesis Augmented by Self-Supervised Learning Representation

Y Yu, J Shi, Y Wu, S Watanabe - arXiv preprint arXiv:2406.08761, 2024 - arxiv.org

Singing Voice Synthesis (SVS) has witnessed significant advancements with the advent of
deep learning techniques. However, a significant challenge in SVS is the scarcity of labeled …

Cited by 1 · Related articles · All 2 versions

SVDD Challenge 2024: A Singing Voice Deepfake Detection Challenge Evaluation Plan

Y Zhang, Y Zang, J Shi, R Yamamoto, J Han… - arXiv preprint arXiv …, 2024 - arxiv.org

The rapid advancement of AI-generated singing voices, which now closely mimic natural
human singing and align seamlessly with musical scores, has led to heightened concerns …

Cited by 2 · Related articles · All 3 versions

SingMOS: An extensive Open-Source Singing Voice Dataset for MOS Prediction

Y Tang, J Shi, Y Wu, Q Jin - arXiv preprint arXiv:2406.10911, 2024 - arxiv.org

In speech generation tasks, human subjective ratings, usually referred to as the opinion
score, are considered the "gold standard" for speech quality evaluation, with the mean …

Bridging Facial Imagery and Vocal Reality: Stable Diffusion-Enhanced Voice Generation

Y Lin, D Liu, Y Xu, H Suo, M Li - sites.duke.edu

Generating novel voices in speech synthesis is a challenging task with potential for creating
versatile voices that are needed in entertainment and research. One of the primary obstacles …