Prominence features: Effective emotional features for speech emotion recognition

S Jing, X Mao, L Chen - Digital Signal Processing, 2018 - Elsevier
Emotion-related feature extraction is a challenging task in speech emotion recognition. Due
to the lack of discriminative acoustic features, classical approaches based on traditional …

BLSTM-CRF Based End-to-End Prosodic Boundary Prediction with Context Sensitive Embeddings in a Text-to-Speech Front-End

Y Zheng, J Tao, Z Wen, Y Li - Interspeech, 2018 - isca-archive.org
In this paper, we propose a language-independent end-to-end architecture for prosodic
boundary prediction based on BLSTM-CRF. The proposed architecture has three …

Using continuous lexical embeddings to improve symbolic-prosody prediction in a text-to-speech front-end

A Rendel, R Fernandez, R Hoory… - … on Acoustics, Speech …, 2016 - ieeexplore.ieee.org
The prediction of symbolic prosodic categories from text is an important, but challenging,
natural-language processing task given the various ways in which an input can be realized …

Deep learning for prominence detection in children's read speech

M Vaidya, K Sabu, P Rao - ICASSP 2022-2022 IEEE …, 2022 - ieeexplore.ieee.org
The detection of perceived prominence in speech has attracted approaches ranging from
the design of knowledge-based linguistic and acoustic features to the automatic feature …

Acoustic and temporal representations in convolutional neural network models of prosodic events

S Stehwien, A Schweitzer, NT Vu - Speech Communication, 2020 - Elsevier
Prosodic events such as pitch accents and phrase boundaries have various acoustic and
temporal correlates that are used as features in machine learning models to automatically …

3PRO – An unsupervised method for the automatic detection of sentence prominence in speech

S Kakouros, O Räsänen - Speech Communication, 2016 - Elsevier
Automatic detection of prominence in speech has attracted interest in recent years due to its
multiple uses in spoken language applications. However, typical approaches require …

Psst! prosodic speech segmentation with transformers

N Roll, C Graham, S Todd - arXiv preprint arXiv:2302.01984, 2023 - arxiv.org
Self-attention mechanisms have enabled transformers to achieve superhuman-level
performance on many speech-to-text (STT) tasks, yet the challenge of automatic prosodic …

Improving Prosodic Boundaries Prediction for Mandarin Speech Synthesis by Using Enhanced Embedding Feature and Model Fusion Approach

Y Zheng, Y Li, Z Wen, X Ding, J Tao - INTERSPEECH, 2016 - isca-archive.org
Hierarchical prosody structure generation is an important but challenging component for
speech synthesis systems. In this paper, we investigate the use of enhanced embedding …

Prosodic event recognition using convolutional neural networks with context information

S Stehwien, NT Vu - arXiv preprint arXiv:1706.00741, 2017 - arxiv.org
This paper demonstrates the potential of convolutional neural networks (CNN) for detecting
and classifying prosodic events on words, specifically pitch accents and phrase boundary …

Speech, prosody, and machines: Nine challenges for prosody research

A Rosenberg - Proc. Speech Prosody, 2018 - isca-archive.org
Speech technology is becoming commonplace. Traditional telephony based interactive
voice systems have been joined by virtual assistants and navigation systems to create a …