WaveCycleGAN2: Time-domain neural post-filter for speech waveform generation

DNSMOS: A non-intrusive perceptual objective speech quality metric to evaluate noise suppressors

CKA Reddy, V Gopal, R Cutler - ICASSP 2021-2021 IEEE …, 2021 - ieeexplore.ieee.org

Human subjective evaluation is the "gold standard" to evaluate speech quality optimized for
human perception. Perceptual objective metrics serve as a proxy for subjective scores. The …

Cited by 298 - Related articles - All 4 versions

DNSMOS P. 835: A non-intrusive perceptual objective speech quality metric to evaluate noise suppressors

CKA Reddy, V Gopal, R Cutler - ICASSP 2022-2022 IEEE …, 2022 - ieeexplore.ieee.org

Human subjective evaluation is the "gold standard" to evaluate speech quality optimized for
human perception. Perceptual objective metrics serve as a proxy for subjective scores. We …

Cited by 202 - Related articles - All 3 versions

ASVspoof 2019: A large-scale public database of synthesized, converted and replayed speech

X Wang, J Yamagishi, M Todisco, H Delgado… - Computer Speech & …, 2020 - Elsevier

Automatic speaker verification (ASV) is one of the most natural and convenient means of
biometric person recognition. Unfortunately, just like all other biometric systems, ASV is …

Cited by 414 - Related articles - All 15 versions

Deepfake: definitions, performance metrics and standards, datasets and benchmarks, and a meta-review

E Altuncu, VNL Franqueira, S Li - arXiv preprint arXiv:2208.10913, 2022 - arxiv.org

Recent advancements in AI, especially deep learning, have contributed to a significant
increase in the creation of new realistic-looking synthetic media (video, image, and audio) …

Cited by 21 - Related articles - All 4 versions

Video and audio deepfake datasets and open issues in deepfake technology: being ahead of the curve

Z Akhtar, TL Pendyala, VS Athmakuri - Forensic Sciences, 2024 - mdpi.com

The revolutionary breakthroughs in Machine Learning (ML) and Artificial Intelligence (AI) are
extensively being harnessed across a diverse range of domains, e.g., forensic science …

Cited by 5 - Related articles

Synthetic speech detection through short-term and long-term prediction traces

C Borrelli, P Bestagini, F Antonacci, A Sarti… - EURASIP Journal on …, 2021 - Springer

Several methods for synthetic audio speech generation have been developed in the
literature through the years. With the great technological advances brought by deep …

Cited by 80 - Related articles - All 9 versions

An explainability study of the constant Q cepstral coefficient spoofing countermeasure for automatic speaker verification

H Tak, J Patino, A Nautsch, N Evans… - arXiv preprint arXiv …, 2020 - arxiv.org

Anti-spoofing for automatic speaker verification is now a well-established area of research,
with three competitive challenges having been held in the last 6 years. A great deal of …

Cited by 54 - Related articles - All 6 versions

Generative adversarial networks in human emotion synthesis: A review

N Hajarolasvadi, MA Ramirez, W Beccaro… - IEEE …, 2020 - ieeexplore.ieee.org

Deep generative models have become an emerging topic in various research areas like
computer vision and signal processing. These models allow synthesizing realistic data …

Cited by 31 - Related articles - All 15 versions

How convolutional neural networks deal with aliasing

AH Ribeiro, TB Schön - ICASSP 2021-2021 IEEE International …, 2021 - ieeexplore.ieee.org

The convolutional neural network (CNN) remains an essential tool in solving computer
vision problems. Standard convolutional architectures consist of stacked layers of operations …

Cited by 27 - Related articles - All 6 versions

An adaptive-learning-based generative adversarial network for one-to-one voice conversion

S Dhar, ND Jana, S Das - IEEE Transactions on artificial …, 2022 - ieeexplore.ieee.org

Voice conversion (VC) emerged as a significant domain of research in the field of speech
synthesis in recent years due to its emerging application in voice-assistive technologies …

Cited by 20 - Related articles - All 4 versions