LDNet: Unified listener dependent modeling in MOS prediction for synthetic speech

WC Huang, E Cooper, J Yamagishi… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org
An effective approach to automatically predict the subjective rating for synthetic speech is to
train on a listening test dataset with human-annotated scores. Although each speech sample …

TorchAudio-Squim: Reference-less speech quality and intelligibility measures in TorchAudio

A Kumar, K Tan, Z Ni, P Manocha… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org
Measuring quality and intelligibility of a speech signal is usually a critical step in
development of speech processing systems. To enable this, a variety of metrics to measure …

Speech quality assessment through MOS using non-matching references

P Manocha, A Kumar - arXiv preprint arXiv:2206.12285, 2022 - arxiv.org
Human judgments obtained through Mean Opinion Scores (MOS) are the most reliable way
to assess the quality of speech signals. However, several recent attempts to automatically …

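Objective evaluation of the jamming effect on communication speech based on feature fusion

林云, 徐怀韬, 王森, 张思成, 庄龙 - 通信学报 (Journal on Communications), 2023 - infocomm-journal.com
To address the problem of objectively evaluating the jamming effect on communication speech, two
evaluation methods based on multi-measure fusion and multimodal fusion are proposed. First, the jammed speech data are preprocessed using an endpoint detection algorithm and a dynamic time warping algorithm. Then …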

Audio similarity is unreliable as a proxy for audio quality

P Manocha, Z Jin, A Finkelstein - arXiv preprint arXiv:2206.13411, 2022 - arxiv.org
Many audio processing tasks require perceptual assessment. However, the time and
expense of obtaining "gold standard" human judgments limit the availability of such data …

REAL-M: Towards speech separation on real mixtures

C Subakan, M Ravanelli, S Cornell… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org
In recent years, deep learning based source separation has achieved impressive results.
Most studies, however, still evaluate separation models on synthetic datasets, while the …

Audio Visual Speaker Localization from EgoCentric Views

J Zhao, Y Xu, X Qian, W Wang - arXiv preprint arXiv:2309.16308, 2023 - arxiv.org
The use of audio and visual modality for speaker localization has been well studied in the
literature by exploiting their complementary characteristics. However, most previous works …

Non-Intrusive Speech Quality Assessment Based on Deep Neural Networks for Speech Communication

M Liu, J Wang, F Wang, F Xiang… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Traditionally, speech quality evaluation relies on subjective assessments or intrusive
methods that require reference signals or additional equipment. However, over recent years …

On intrusive speech quality measures and a global SNR based metric

C Pan, J Chen, J Benesty - Speech Communication, 2024 - Elsevier
Measuring the quality of noisy speech signals has been an increasingly important problem
in the field of speech processing as more and more speech-communication and human …

Efficient speech quality assessment using self-supervised framewise embeddings

K El Hajal, Z Wu, N Scheidwasser-Clow… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org
Automatic speech quality assessment is essential for audio researchers, developers, speech
and language pathologists, and system quality engineers. The current state-of-the-art …