A review on subjective and objective evaluation of synthetic speech
Evaluating synthetic speech generated by machines is a complicated process, as it involves
judging along multiple dimensions including naturalness, intelligibility, and whether the …
LE-SSL-MOS: Self-Supervised Learning MOS Prediction with Listener Enhancement
Z Qi, X Hu, W Zhou, S Li, H Wu, J Lu… - 2023 IEEE Automatic …, 2023 - ieeexplore.ieee.org
Recently, researchers have shown an increasing interest in automatically predicting the
subjective evaluation for speech synthesis systems. This prediction is a challenging task …
Self-Supervised Speech Quality Estimation and Enhancement Using Only Clean Speech
Speech quality estimation has recently undergone a paradigm shift from human-hearing
expert designs to machine-learning models. However, current models rely mainly on …
RAMP: Retrieval-Augmented MOS Prediction via Confidence-based Dynamic Weighting
Automatic Mean Opinion Score (MOS) prediction is crucial to evaluate the perceptual quality
of the synthetic speech. While recent approaches using pre-trained self-supervised learning …
MSQAT: A multi-dimension non-intrusive speech quality assessment transformer utilizing self-supervised representations
Convolutional neural networks (CNNs) have been widely utilized as the main building block
for many non-intrusive speech quality assessment (NISQA) methods. A new trend is to add a …
Partial Rank Similarity Minimization Method for Quality MOS Prediction of Unseen Speech Synthesis Systems in Zero-Shot and Semi-Supervised Setting
H Yadav, E Cooper, J Yamagishi… - 2023 IEEE Automatic …, 2023 - ieeexplore.ieee.org
This paper introduces a novel objective function for quality mean opinion score (MOS)
prediction of unseen speech synthesis systems. The proposed function measures the …
SQAT-LD: SPeech Quality Assessment Transformer Utilizing Listener Dependent Modeling for Zero-Shot Out-of-Domain MOS Prediction
In this paper, we propose the speech quality assessment transformer utilizing listener
dependent modeling (SQAT-LD) mean opinion score (MOS) prediction system, which was …
MooseNet: A Trainable Metric for Synthesized Speech with a PLDA Module
We present MooseNet, a trainable speech metric that predicts the listeners' Mean Opinion
Score (MOS). We propose a novel approach where the Probabilistic Linear Discriminative …
SingMOS: An extensive Open-Source Singing Voice Dataset for MOS Prediction
In speech generation tasks, human subjective ratings, usually referred to as the opinion
score, are considered the "gold standard" for speech quality evaluation, with the mean …
Evaluation of Speech Representations for MOS prediction
In this paper, we evaluate feature extraction models for predicting speech quality. We also
propose a model architecture to compare embeddings of supervised learning and self …