Attentive Merging of Hidden Embeddings from Pre-trained Speech Model for Anti-spoofing Detection

Y Zhang, Y Zang, J Shi, R Yamamoto, T Toda… - arXiv preprint arXiv …, 2024 - arxiv.org

With the advancements in singing voice generation and the growing presence of AI singers
on media platforms, the inaugural Singing Voice Deepfake Detection (SVDD) Challenge …

Cited by: 3

Speech Foundation Model Ensembles for the Controlled Singing Voice Deepfake Detection (CtrSVDD) Challenge 2024

A Guragain, T Liu, Z Pan, HB Sailor, Q Wang - arXiv preprint arXiv …, 2024 - arxiv.org

This work details our approach to achieving a leading system with a 1.79% pooled equal
error rate (EER) on the evaluation set of the Controlled Singing Voice Deepfake Detection …

Cited by: 3

Mixture of Experts Fusion for Fake Audio Detection Using Frozen wav2vec 2.0

Z Wang, R Fu, Z Wen, J Tao, X Wang, Y Xie… - arXiv preprint arXiv …, 2024 - arxiv.org

Speech synthesis technology has posed a serious threat to speaker verification systems.
Currently, the most effective fake audio detection methods utilize pretrained models, and …

Cited by: 1

Towards Quantifying and Reducing Language Mismatch Effects in Cross-Lingual Speech Anti-Spoofing

T Liu, I Kukanov, Z Pan, Q Wang, HB Sailor… - arXiv preprint arXiv …, 2024 - arxiv.org

The effects of language mismatch impact speech anti-spoofing systems, while investigations
and quantification of these effects remain limited. Existing anti-spoofing datasets are mainly …

DFADD: The Diffusion and Flow-Matching Based Audio Deepfake Dataset

J Du, I Lin, I Chiu, X Chen, H Wu, W Ren… - arXiv preprint arXiv …, 2024 - arxiv.org

Mainstream zero-shot TTS production systems like Voicebox and Seed-TTS achieve human
parity speech by leveraging Flow-matching and Diffusion models, respectively …