A task-optimized neural network replicates human auditory behavior, predicts brain responses, and reveals a cortical processing hierarchy

AJE Kell, DLK Yamins, EN Shook… - Neuron, 2018 - cell.com
A core goal of auditory neuroscience is to build quantitative models that predict cortical
responses to natural sounds. Reasoning that a complete model of auditory cortex must solve …

Quantitative acoustic measurements for characterization of speech and voice disorders in early untreated Parkinson's disease

J Rusz, R Cmejla, H Ruzickova… - The journal of the …, 2011 - pubs.aip.org
An assessment of vocal impairment is presented for separating healthy people from persons
with early untreated Parkinson's disease (PD). This study's main purpose was to (a) …

Self-reported symptoms of depression and PTSD are associated with reduced vowel space in screening interviews

S Scherer, GM Lucas, J Gratch… - IEEE Transactions …, 2015 - ieeexplore.ieee.org
Reduced frequency range in vowel production is a well documented speech characteristic of
individuals with psychological and neurological disorders. Affective disorders such as …

Schema learning for the cocktail party problem

KJP Woods, JH McDermott - Proceedings of the National …, 2018 - National Acad Sciences
The cocktail party problem requires listeners to infer individual sound sources from mixtures
of sound. The problem can be solved only by leveraging regularities in natural sound …

Speech synthesis for the generation of artificial personality

MP Aylett, A Vinciarelli, M Wester - IEEE transactions on …, 2017 - ieeexplore.ieee.org
A synthetic voice personifies the system using it. In this work we examine the impact text
content, voice quality and synthesis system have on the perceived personality of two …

A comprehensive vowel space for whispered speech

HR Sharifzadeh, IV McLoughlin, MJ Russell - Journal of voice, 2012 - Elsevier
Whispered speech is a relatively common form of communications, used primarily to
selectively exclude or include potential listeners from hearing a spoken message. Despite …

Refining a deep learning-based formant tracker using linear prediction methods

P Alku, SR Kadiri, D Gowda - Computer Speech & Language, 2023 - Elsevier
In this study, formant tracking is investigated by refining the formants tracked by an existing
data-driven tracker, DeepFormants, using the formants estimated in a model-driven manner …

MASC: A speech corpus in Mandarin for emotion analysis and affective speaker recognition

T Wu, Y Yang, Z Wu, D Li - 2006 IEEE odyssey-the speaker and …, 2006 - ieeexplore.ieee.org
In this paper, a large emotional speech database MASC (Mandarin affective speech corpus)
is introduced. The database contains recordings of 68 native speakers (23 female and 45 …

Effects of telephone transmission on the performance of formant-trajectory-based forensic voice comparison – female voices

C Zhang, GS Morrison, E Enzinger, F Ochoa - Speech Communication, 2013 - Elsevier
In forensic-voice-comparison casework a common scenario is that the suspect's voice is
recorded directly using a microphone in an interview room but the offender's voice is …

Making accurate formant measurements: An empirical investigation of the influence of the measurement tool, analysis settings and speaker on formant measurements

P Harrison - 2013 - etheses.whiterose.ac.uk
The aim of this thesis is to provide guidance and information that will assist forensic speech
scientists, and phoneticians generally, in making more accurate formant measurements …