Where's the uh, hesitation? The interplay between filled pause location, speech rate and fundamental frequency in perception of confidence.
Much of the research investigating the perception of speaker certainty has relied on either
attempting to elicit prosodic features in read speech, or artificial manipulation of recorded …
Prosody-controllable spontaneous TTS with neural HMMs
Spontaneous speech has many affective and pragmatic functions that are interesting and
challenging to model in TTS. However, the presence of reduced articulation, fillers …
Breathing and speech planning in spontaneous speech synthesis
Breathing and speech planning in spontaneous speech are coordinated processes, often
exhibiting disfluent patterns. While synthetic speech is not subject to respiratory needs …
Fillers in spoken language understanding: Computational and psycholinguistic perspectives
Disfluencies (i.e., interruptions in the regular flow of speech) are ubiquitous in spoken
discourse. Fillers ("uh", "um") are disfluencies that occur the most frequently compared to …
Perception of concatenative vs. neural text-to-speech (TTS): Differences in intelligibility in noise and language attitudes
This study tests speech-in-noise perception and social ratings of speech produced by
different text-to-speech (TTS) synthesis methods. We used identical speaker training …
Personality in the mix - investigating the contribution of fillers and speaking style to the perception of spontaneous speech synthesis
Studies on human-human interactions have shown that the fluency of a speaker
influences the perception of personality. Adding fillers and discourse markers can make the …
Evaluating sampling-based filler insertion with spontaneous TTS
Inserting fillers (such as “um”, “like”) to clean speech text has a rich history of study. One
major application is to make dialogue systems sound more spontaneous. The ambiguity of …
Synthesis after a couple PINTs: Investigating the role of pause-internal phonetic particles in speech synthesis and perception
Pause-internal phonetic particles (PINTs), such as breath noises, tongue clicks and
hesitations, play an important role in speech perception but are rarely modeled in speech …
Transplantation of conversational speaking style with interjections in sequence-to-sequence speech synthesis
Sequence-to-Sequence Text-to-Speech architectures that directly generate low level
acoustic features from phonetic sequences are known to produce natural and expressive …
On the use of self-supervised speech representations in spontaneous speech synthesis
Self-supervised learning (SSL) speech representations learned from large amounts of
diverse, mixed-quality speech data without transcriptions are gaining ground in many …