Speech processing for robust speaker recognition: Analysis and advancements for whispered speech

JHL Hansen, C Zhang, X Fan. In: Forensic Speaker Recognition: Law Enforcement and Counter-Terrorism. Springer, 2012.

Abstract

In the field of voice forensics, the ability to perform effective speaker recognition from input audio streams is an important task. However, in many situations, individuals may prefer to lower their risk of being heard in public settings via whisper mode during communications. It is in precisely these conditions that speaker recognition should remain effective. Limited formal research has been performed in this domain to date. Whisper is an alternative speech production mode used by subjects in public conversation to protect content privacy or identity. Due to the profound differences between whisper and neutral speech in terms of spectral structure, the performance of speaker identification systems trained with neutral speech degrades significantly. In this chapter, studies that address acoustic analysis of whisper will be reviewed. Next, an effective data collection procedure for both spontaneous and read whisper speech will be introduced. An algorithm for whisper speech detection, which is a crucial front-end for whisper speech processing algorithms, will be presented. Finally, a seamless neutral/whisper mismatched closed-set speaker recognition system will be introduced. In the evaluation, a traditional MFCC-GMM system is employed as the baseline speaker ID system. An analysis of both speaker and phoneme variability in speaker ID performance using neutral-trained GMMs is provided, which forms the basis for a final combined whisper-based speaker ID system. Experimental results are also provided, followed by directions for future work.


