Prominence features: Effective emotional features for speech emotion recognition
Emotion-related feature extraction is a challenging task in speech emotion recognition. Due
to the lack of discriminative acoustic features, classical approaches based on traditional …
BLSTM-CRF Based End-to-End Prosodic Boundary Prediction with Context Sensitive Embeddings in a Text-to-Speech Front-End.
Y Zheng, J Tao, Z Wen, Y Li - Interspeech, 2018 - isca-archive.org
In this paper, we propose a language-independent end-to-end architecture for prosodic
boundary prediction based on BLSTM-CRF. The proposed architecture has three …
Using continuous lexical embeddings to improve symbolic-prosody prediction in a text-to-speech front-end
A Rendel, R Fernandez, R Hoory… - … on Acoustics, Speech …, 2016 - ieeexplore.ieee.org
The prediction of symbolic prosodic categories from text is an important, but challenging,
natural-language processing task given the various ways in which an input can be realized …
Deep learning for prominence detection in children's read speech
The detection of perceived prominence in speech has attracted approaches ranging from
the design of knowledge-based linguistic and acoustic features to the automatic feature …
Acoustic and temporal representations in convolutional neural network models of prosodic events
S Stehwien, A Schweitzer, NT Vu - Speech Communication, 2020 - Elsevier
Prosodic events such as pitch accents and phrase boundaries have various acoustic and
temporal correlates that are used as features in machine learning models to automatically …
3PRO – An unsupervised method for the automatic detection of sentence prominence in speech
S Kakouros, O Räsänen - Speech Communication, 2016 - Elsevier
Automatic detection of prominence in speech has attracted interest in recent years due to its
multiple uses in spoken language applications. However, typical approaches require …
Psst! prosodic speech segmentation with transformers
Self-attention mechanisms have enabled transformers to achieve superhuman-level
performance on many speech-to-text (STT) tasks, yet the challenge of automatic prosodic …
Improving Prosodic Boundaries Prediction for Mandarin Speech Synthesis by Using Enhanced Embedding Feature and Model Fusion Approach.
Y Zheng, Y Li, Z Wen, X Ding, J Tao - INTERSPEECH, 2016 - isca-archive.org
Hierarchical prosody structure generation is an important but challenging component for
speech synthesis systems. In this paper, we investigate the use of enhanced embedding …
Prosodic event recognition using convolutional neural networks with context information
S Stehwien, NT Vu - arXiv preprint arXiv:1706.00741, 2017 - arxiv.org
This paper demonstrates the potential of convolutional neural networks (CNN) for detecting
and classifying prosodic events on words, specifically pitch accents and phrase boundary …
Speech, prosody, and machines: Nine challenges for prosody research
A Rosenberg - Proc. Speech Prosody, 2018 - isca-archive.org
Speech technology is becoming commonplace. Traditional telephony based interactive
voice systems have been joined by virtual assistants and navigation systems to create a …