Metricnet: Towards improved modeling for non-intrusive speech quality assessment

WC Huang, E Cooper, J Yamagishi… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org

An effective approach to automatically predict the subjective rating for synthetic speech is to
train on a listening test dataset with human-annotated scores. Although each speech sample …

Cited by 67 Related articles All 4 versions

Torchaudio-squim: Reference-less speech quality and intelligibility measures in torchaudio

A Kumar, K Tan, Z Ni, P Manocha… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org

Measuring quality and intelligibility of a speech signal is usually a critical step in
development of speech processing systems. To enable this, a variety of metrics to measure …

Cited by 31 Related articles All 3 versions

Speech quality assessment through MOS using non-matching references

P Manocha, A Kumar - arXiv preprint arXiv:2206.12285, 2022 - arxiv.org

Human judgments obtained through Mean Opinion Scores (MOS) are the most reliable way
to assess the quality of speech signals. However, several recent attempts to automatically …

Cited by 24 Related articles All 5 versions

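Objective evaluation of communication speech jamming effect based on feature fusion

Y Lin, H Xu, S Wang, S Zhang, L Zhuang - Journal on Communications (通信学报), 2023 - infocomm-journal.com

To address the problem of objectively evaluating the jamming effect on communication speech, two evaluation methods
are proposed, based on multi-measure fusion and multimodal fusion. First, the jammed speech data are preprocessed using an endpoint detection algorithm and a dynamic time warping algorithm. Then …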

Cited by 3 Related articles All 3 versions

Audio similarity is unreliable as a proxy for audio quality

P Manocha, Z Jin, A Finkelstein - arXiv preprint arXiv:2206.13411, 2022 - arxiv.org

Many audio processing tasks require perceptual assessment. However, the time and
expense of obtaining "gold standard" human judgments limit the availability of such data …

Cited by 8 Related articles All 7 versions

REAL-M: Towards speech separation on real mixtures

C Subakan, M Ravanelli, S Cornell… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org

In recent years, deep learning based source separation has achieved impressive results.
Most studies, however, still evaluate separation models on synthetic datasets, while the …

Cited by 17 Related articles All 4 versions

Audio Visual Speaker Localization from EgoCentric Views

J Zhao, Y Xu, X Qian, W Wang - arXiv preprint arXiv:2309.16308, 2023 - arxiv.org

The use of audio and visual modality for speaker localization has been well studied in the
literature by exploiting their complementary characteristics. However, most previous works …

Cited by 5 Related articles All 3 versions

Non-Intrusive Speech Quality Assessment Based on Deep Neural Networks for Speech Communication

M Liu, J Wang, F Wang, F Xiang… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

Traditionally, speech quality evaluation relies on subjective assessments or intrusive
methods that require reference signals or additional equipment. However, over recent years …

Cited by 3 Related articles All 3 versions

On intrusive speech quality measures and a global SNR based metric

C Pan, J Chen, J Benesty - Speech Communication, 2024 - Elsevier

Measuring the quality of noisy speech signals has been an increasingly important problem
in the field of speech processing as more and more speech-communication and human …

Cited by 1 Related articles

Efficient speech quality assessment using self-supervised framewise embeddings

K El Hajal, Z Wu, N Scheidwasser-Clow… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org

Automatic speech quality assessment is essential for audio researchers, developers, speech
and language pathologists, and system quality engineers. The current state-of-the-art …

Cited by 7 Related articles All 3 versions