Whisper-SV: Adapting Whisper for low-data-resource speaker verification

L Zhang, N Jiang, Q Wang, Y Li, Q Lu, L Xie - Speech Communication, 2024 - Elsevier
Trained on 680,000 h of massive speech data, Whisper is a multitasking, multilingual
speech foundation model demonstrating superior performance in automatic speech …

AVID: A speech database for machine learning studies on vocal intensity

P Alku, M Kodali, L Laaksonen, SR Kadiri - Speech Communication, 2024 - Elsevier
Vocal intensity, which is quantified typically with the sound pressure level (SPL), is a key
feature of speech. To measure SPL from speech recordings, a standard calibration tone …

Whisper in Focus: Enhancing Stuttered Speech Classification with Encoder Layer Optimization

H Ameer, S Latif, R Latif, S Mukhtar - arXiv preprint arXiv:2311.05203, 2023 - arxiv.org
In recent years, advancements in the field of speech processing have led to cutting-edge
deep learning algorithms with immense potential for real-world applications. The automated …

Deep Learning for Automatic Classification of Speech Intensity Modes

L Ansari - 2023 - aaltodoc.aalto.fi
One of the fundamental phenomena in speech processing is speech intensity. As a concept,
speech intensity and its regulation help capture various aspects as well as changes in the …

Whisper-SV: Adapting Whisper for Low-Resource Speaker Verification

L Zhang, N Jiang, Q Wang, Y Li, Q Lu, L Xie - Available at SSRN 4725202 - papers.ssrn.com
Recent advancements in automatic speech recognition (ASR), exemplified by Whisper, have
demonstrated the potential of these systems to approach human-level performance given …