DNSMOS: A non-intrusive perceptual objective speech quality metric to evaluate noise suppressors
Human subjective evaluation is the "gold standard" to evaluate speech quality optimized for
human perception. Perceptual objective metrics serve as a proxy for subjective scores. The …
DNSMOS P.835: A non-intrusive perceptual objective speech quality metric to evaluate noise suppressors
Human subjective evaluation is the "gold standard" to evaluate speech quality optimized for
human perception. Perceptual objective metrics serve as a proxy for subjective scores. We …
ASVspoof 2019: A large-scale public database of synthesized, converted and replayed speech
Automatic speaker verification (ASV) is one of the most natural and convenient means of
biometric person recognition. Unfortunately, just like all other biometric systems, ASV is …
Deepfake: definitions, performance metrics and standards, datasets and benchmarks, and a meta-review
Recent advancements in AI, especially deep learning, have contributed to a significant
increase in the creation of new realistic-looking synthetic media (video, image, and audio) …
Video and audio deepfake datasets and open issues in deepfake technology: being ahead of the curve
Z Akhtar, TL Pendyala, VS Athmakuri - Forensic Sciences, 2024 - mdpi.com
The revolutionary breakthroughs in Machine Learning (ML) and Artificial Intelligence (AI) are
extensively being harnessed across a diverse range of domains, e.g., forensic science …
Synthetic speech detection through short-term and long-term prediction traces
Several methods for synthetic audio speech generation have been developed in the
literature through the years. With the great technological advances brought by deep …
An explainability study of the constant Q cepstral coefficient spoofing countermeasure for automatic speaker verification
Anti-spoofing for automatic speaker verification is now a well established area of research,
with three competitive challenges having been held in the last 6 years. A great deal of …
Generative adversarial networks in human emotion synthesis: A review
N Hajarolasvadi, MA Ramirez, W Beccaro… - IEEE …, 2020 - ieeexplore.ieee.org
Deep generative models have become an emerging topic in various research areas like
computer vision and signal processing. These models allow synthesizing realistic data …
How convolutional neural networks deal with aliasing
AH Ribeiro, TB Schön - ICASSP 2021-2021 IEEE International …, 2021 - ieeexplore.ieee.org
The convolutional neural network (CNN) remains an essential tool in solving computer
vision problems. Standard convolutional architectures consist of stacked layers of operations …
An adaptive-learning-based generative adversarial network for one-to-one voice conversion
Voice conversion (VC) emerged as a significant domain of research in the field of speech
synthesis in recent years due to its emerging application in voice-assistive technologies …