One-class learning towards synthetic voice spoofing detection

Y Zhang, F Jiang, Z Duan - IEEE Signal Processing Letters, 2021 - ieeexplore.ieee.org
Human voices can be used to authenticate the identity of a speaker, but automatic
speaker verification (ASV) systems are vulnerable to voice spoofing attacks, such as …

Voice conversion challenge 2020: Intra-lingual semi-parallel and cross-lingual voice conversion

Y Zhao, WC Huang, X Tian, J Yamagishi… - arXiv preprint arXiv …, 2020 - arxiv.org
The voice conversion challenge is a biennial scientific event held to compare and
understand different voice conversion (VC) systems built on a common dataset. In 2020, we …

Generalization ability of MOS prediction networks

E Cooper, WC Huang, T Toda… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org
Automatic methods to predict listener opinions of synthesized speech remain elusive since
listeners, systems being evaluated, characteristics of the speech, and even the instructions …

Deepfake: definitions, performance metrics and standards, datasets and benchmarks, and a meta-review

E Altuncu, VNL Franqueira, S Li - arXiv preprint arXiv:2208.10913, 2022 - arxiv.org
Recent advancements in AI, especially deep learning, have contributed to a significant
increase in the creation of new realistic-looking synthetic media (video, image, and audio) …

StarGANv2-VC: A diverse, unsupervised, non-parallel framework for natural-sounding voice conversion

YA Li, A Zare, N Mesgarani - arXiv preprint arXiv:2107.10394, 2021 - arxiv.org
We present an unsupervised non-parallel many-to-many voice conversion (VC) method
using a generative adversarial network (GAN) called StarGAN v2. Using a combination of …

Evaluation of an audio-video multimodal deepfake dataset using unimodal and multimodal detectors

H Khalid, M Kim, S Tariq, SS Woo - Proceedings of the 1st workshop on …, 2021 - dl.acm.org
Significant advances in deepfake generation have caused security and
privacy issues. Attackers can easily impersonate a person's identity in an image by replacing …

Human perception of audio deepfakes

NM Müller, K Pizzi, J Williams - … of the 1st International Workshop on …, 2022 - dl.acm.org
The recent emergence of deepfakes has brought manipulated and generated content to the
forefront of machine learning research. Automatic detection of deepfakes has seen many …

How do voices from past speech synthesis challenges compare today?

E Cooper, J Yamagishi - arXiv preprint arXiv:2105.02373, 2021 - arxiv.org
Shared challenges provide a venue for comparing systems trained on common data using a
standardized evaluation, and they also provide an invaluable resource for researchers when …

UR channel-robust synthetic speech detection system for ASVspoof 2021

X Chen, Y Zhang, G Zhu, Z Duan - arXiv preprint arXiv:2107.12018, 2021 - arxiv.org
In this paper, we present the UR-AIR system submission to the logical access (LA) and
speech deepfake (DF) tracks of the ASVspoof 2021 Challenge. The LA and DF tasks focus …

Known-unknown data augmentation strategies for detection of logical access, physical access and speech deepfake attacks: ASVspoof 2021

RK Das - Proc. 2021 Edition of the Automatic Speaker …, 2021 - isca-archive.org
The rising demand for voice biometric systems also increases the threat from various kinds
of spoofing attacks by unauthorized users. The latest ASVspoof 2021 challenge is devoted to …