Increasing prosodic variability of text-to-speech synthesizers.

G Zellou, N Holliday - Frontiers in Computer Science, 2024 - frontiersin.org

This article reviews recent literature investigating speech variation in production and
comprehension during spoken language communication between humans and devices …

Cited by 1 Related articles

A large-scale user study of an Alexa prize chatbot: Effect of TTS dynamism on perceived quality of social dialog

M Cohn, CY Chen, Z Yu - Proceedings of the 20th Annual SIGdial …, 2019 - aclanthology.org

This study tests the effect of cognitive-emotional expression in an Alexa text-to-speech (TTS)
voice on users' experience with a social dialog system. We systematically introduced …

Cited by 33 Related articles All 5 versions

Extracting and predicting word-level style variations for speech synthesis

YJ Zhang, ZH Ling - IEEE/ACM Transactions on Audio, Speech …, 2021 - ieeexplore.ieee.org

This paper proposes a speech synthesis method based on unsupervisedly-learned fine-
grained style representations, named word-level style variations (WSVs), in order to improve …

Cited by 16 Related articles All 2 versions

Listener beliefs and perceptual learning: Differences between device and human guises

G Zellou, M Cohn, A Pycha - Language, 2023 - muse.jhu.edu

Listeners have a remarkable ability to adapt to novel speech patterns, such as a new accent
or an idiosyncratic pronunciation. In almost all of the previous studies examining this …

Cited by 3 Related articles All 5 versions

Partial compensation for coarticulatory vowel nasalization across concatenative and neural text-to-speech

G Zellou, M Cohn, A Block - The Journal of the Acoustical Society of …, 2021 - pubs.aip.org

This study investigates the perception of coarticulatory vowel nasality generated using
different text-to-speech (TTS) methods in American English. Experiment 1 compared …

Cited by 9 Related articles All 6 versions

Top-down effects of apparent humanness on vocal alignment toward human and device interlocutors.

G Zellou, M Cohn - CogSci, 2020 - cognitivesciencesociety.org

Humans are now regularly speaking to voice-activated artificially intelligent (voice-AI)
assistants. Yet, our understanding of the cognitive mechanisms at play during speech …

Cited by 8 Related articles All 3 versions

Improving HMM speech synthesis of interrogative sentences by pitch track transformations

P Nagy, G Németh - Speech Communication, 2016 - Elsevier

Modeling interrogative sentence prosody is a challenging task due to the significant
variation of questions. Prosody is produced by intonation, intensity and duration features …

Cited by 11 Related articles All 4 versions

Perceptual Adaptation to Device and Human Voices: Learning and Generalization of a Phonetic Shift Across Real and Voice-AI Talkers.

BF Segedin, M Cohn, G Zellou - INTERSPEECH, 2019 - researchgate.net

Voice-activated artificially-intelligent digital devices are a new type of interlocutor. Like for
human talkers, they have idiosyncratic speech patterns that require listeners to perceptually …

Cited by 8 Related articles All 6 versions

Identity Disclosure and Anthropomorphism in Voice Chatbot Design: A Field Experiment

Y Xu, H Dai, W Yan - Kenan Institute of Private Enterprise Research …, 2022 - papers.ssrn.com

Fueled by the widespread adoption of algorithms and artificial intelligence (AI), the use of
chatbots has become increasingly popular in various business contexts. In this paper, we …

Generation of pitch curves for Macedonian text-to-speech synthesis

B Gerazov, Z Ivanovski - 6th Forum Acusticum, Aalborg, Denmark, 2011 - researchgate.net

The paper presents in detail the pitch curve generation algorithm, developed for
Macedonian text-to-speech (TTS) synthesis. The algorithm is part of the prosody generation …

Cited by 3 Related articles