Robust formant tracking for continuous speech with speaker variability

A task-optimized neural network replicates human auditory behavior, predicts brain responses, and reveals a cortical processing hierarchy

AJE Kell, DLK Yamins, EN Shook… - Neuron, 2018 - cell.com

A core goal of auditory neuroscience is to build quantitative models that predict cortical
responses to natural sounds. Reasoning that a complete model of auditory cortex must solve …

Cited by 602 · Related articles · All 12 versions

Quantitative acoustic measurements for characterization of speech and voice disorders in early untreated Parkinson's disease

J Rusz, R Cmejla, H Ruzickova… - The journal of the …, 2011 - pubs.aip.org

An assessment of vocal impairment is presented for separating healthy people from persons
with early untreated Parkinson's disease (PD). This study's main purpose was to (a) …

Cited by 571 · Related articles · All 7 versions

Self-reported symptoms of depression and PTSD are associated with reduced vowel space in screening interviews

S Scherer, GM Lucas, J Gratch… - IEEE Transactions …, 2015 - ieeexplore.ieee.org

Reduced frequency range in vowel production is a well documented speech characteristic of
individuals with psychological and neurological disorders. Affective disorders such as …

Cited by 144 · Related articles · All 3 versions

Schema learning for the cocktail party problem

KJP Woods, JH McDermott - Proceedings of the National …, 2018 - National Acad Sciences

The cocktail party problem requires listeners to infer individual sound sources from mixtures
of sound. The problem can be solved only by leveraging regularities in natural sound …

Cited by 64 · Related articles · All 12 versions

Speech synthesis for the generation of artificial personality

MP Aylett, A Vinciarelli, M Wester - IEEE transactions on …, 2017 - ieeexplore.ieee.org

A synthetic voice personifies the system using it. In this work we examine the impact text
content, voice quality and synthesis system have on the perceived personality of two …

Cited by 45 · Related articles · All 4 versions

A comprehensive vowel space for whispered speech

HR Sharifzadeh, IV McLoughlin, MJ Russell - Journal of voice, 2012 - Elsevier

Whispered speech is a relatively common form of communications, used primarily to
selectively exclude or include potential listeners from hearing a spoken message. Despite …

Cited by 78 · Related articles · All 16 versions

Refining a deep learning-based formant tracker using linear prediction methods

P Alku, SR Kadiri, D Gowda - Computer Speech & Language, 2023 - Elsevier

In this study, formant tracking is investigated by refining the formants tracked by an existing
data-driven tracker, DeepFormants, using the formants estimated in a model-driven manner …

Cited by 8 · Related articles · All 10 versions

MASC: A speech corpus in Mandarin for emotion analysis and affective speaker recognition

T Wu, Y Yang, Z Wu, D Li - 2006 IEEE odyssey-the speaker and …, 2006 - ieeexplore.ieee.org

In this paper, a large emotional speech database MASC (Mandarin affective speech corpus)
is introduced. The database contains recordings of 68 native speakers (23 female and 45 …

Cited by 70 · Related articles · All 4 versions

Effects of telephone transmission on the performance of formant-trajectory-based forensic voice comparison–female voices

C Zhang, GS Morrison, E Enzinger, F Ochoa - Speech Communication, 2013 - Elsevier

In forensic-voice-comparison casework a common scenario is that the suspect's voice is
recorded directly using a microphone in an interview room but the offender's voice is …

Cited by 50 · Related articles · All 5 versions

Making accurate formant measurements: An empirical investigation of the influence of the measurement tool, analysis settings and speaker on formant measurements

P Harrison - 2013 - etheses.whiterose.ac.uk

The aim of this thesis is to provide guidance and information that will assist forensic speech
scientists, and phoneticians generally, in making more accurate formant measurements …

Cited by 44 · Related articles · All 3 versions