EmphAssess: a Prosodic Benchmark on Assessing Emphasis Transfer in Speech-to-Speech Models

M de Seyssel, A D'Avirro, A Williams… - arXiv preprint arXiv …, 2023 - arxiv.org
We introduce EmphAssess, a prosodic benchmark designed to evaluate the capability of
speech-to-speech models to encode and reproduce prosodic emphasis. We apply this to …

Controllable Emphasis with zero data for text-to-speech

A Joly, M Nicolis, E Peterova, A Lombardi… - arXiv preprint arXiv …, 2023 - arxiv.org
We present a scalable method to produce high quality emphasis for text-to-speech (TTS)
that does not require recordings or annotations. Many TTS models include a phoneme …

Corrective focus detection in Italian speech using neural networks

A López-Zorrilla, M deVelasco-Vázquez… - Acta Polytechnica …, 2018 - acta.uni-obuda.hu
The corrective focus is a particular kind of prosodic prominence where the speaker
intends to correct or to emphasize a concept. This work develops an Artificial Cognitive …

Detection of Emphasis Words in Short Texts – A Context Aware Label Distribution Learning Approach

Meghana, B Das - Advanced Informatics for Computing Research: 4th …, 2021 - Springer
In multi-label classification problems, the predominant approach is to transform the problem
into a single-label classification problem that can result in the affirmative classification of …