EmphAssess: a Prosodic Benchmark on Assessing Emphasis Transfer in Speech-to-Speech Models
M de Seyssel, A D'Avirro, A Williams… - arXiv preprint arXiv …, 2023 - arxiv.org
We introduce EmphAssess, a prosodic benchmark designed to evaluate the capability of
speech-to-speech models to encode and reproduce prosodic emphasis. We apply this to …
Controllable Emphasis with zero data for text-to-speech
We present a scalable method to produce high quality emphasis for text-to-speech (TTS)
that does not require recordings or annotations. Many TTS models include a phoneme …
Corrective focus detection in Italian speech using neural networks
A López-Zorrilla, M deVelasco-Vázquez… - Acta Polytechnica …, 2018 - acta.uni-obuda.hu
The corrective focus is a particular kind of prosodic prominence where the speaker is
intended to correct or to emphasize a concept. This work develops an Artificial Cognitive …
Detection of Emphasis Words in Short Texts–A Context Aware Label Distribution Learning Approach
In multi-label classification problems, the predominant approach is to transform the problem
into a single-label classification problem that can result in the affirmative classification of …