A review on subjective and objective evaluation of synthetic speech
Evaluating synthetic speech generated by machines is a complicated process, as it involves
judging along multiple dimensions including naturalness, intelligibility, and whether the …
A study on incorporating Whisper for robust speech assessment
This research introduces an enhanced version of the multi-objective speech assessment
model–MOSA-Net+, by leveraging the acoustic features from Whisper, a large-scaled …
LE-SSL-MOS: Self-Supervised Learning MOS Prediction with Listener Enhancement
Recently, researchers have shown an increasing interest in automatically predicting the
subjective evaluation for speech synthesis systems. This prediction is a challenging task …
RAMP: Retrieval-Augmented MOS Prediction via Confidence-based Dynamic Weighting
Automatic Mean Opinion Score (MOS) prediction is crucial to evaluate the perceptual quality
of the synthetic speech. While recent approaches using pre-trained self-supervised learning …
Multi-objective non-intrusive hearing-aid speech assessment model
Because a reference signal is often unavailable in real-world scenarios, reference-free
speech quality and intelligibility assessment models are important for many speech …
Coded Speech Quality Measurement by a Non-Intrusive PESQ-DNN
Wideband codecs such as AMR-WB or EVS are widely used in (mobile) speech
communication. Evaluation of coded speech quality is often performed subjectively by an …
MOS-FAD: Improving Fake Audio Detection Via Automatic Mean Opinion Score Prediction
Automatic Mean Opinion Score (MOS) prediction is employed to evaluate the quality of
synthetic speech. This study extends the application of predicted MOS to the task of Fake …
SQAT-LD: SPeech Quality Assessment Transformer Utilizing Listener Dependent Modeling for Zero-Shot Out-of-Domain MOS Prediction
In this paper, we propose the speech quality assessment transformer utilizing listener
dependent modeling (SQAT-LD) mean opinion score (MOS) prediction system, which was …
Investigating content-aware neural text-to-speech MOS prediction using prosodic and linguistic features
Current state-of-the-art methods for automatic synthetic speech evaluation are based on
MOS prediction neural models. Such MOS prediction models include MOSNet and LDNet …
SingMOS: An extensive Open-Source Singing Voice Dataset for MOS Prediction
In speech generation tasks, human subjective ratings, usually referred to as the opinion
score, are considered the "gold standard" for speech quality evaluation, with the mean …