Modeling phrasing and prominence using deep recurrent learning.

S Jing, X Mao, L Chen - Digital Signal Processing, 2018 - Elsevier

Emotion-related feature extraction is a challenging task in speech emotion recognition. Due
to the lack of discriminative acoustic features, classical approaches based on traditional …

Cited by 69 - Related articles - All 3 versions

BLSTM-CRF Based End-to-End Prosodic Boundary Prediction with Context Sensitive Embeddings in a Text-to-Speech Front-End.

Y Zheng, J Tao, Z Wen, Y Li - Interspeech, 2018 - isca-archive.org

In this paper, we propose a language-independent end-to-end architecture for prosodic
boundary prediction based on BLSTM-CRF. The proposed architecture has three …

Cited by 33 - Related articles - All 4 versions

Using continuous lexical embeddings to improve symbolic-prosody prediction in a text-to-speech front-end

A Rendel, R Fernandez, R Hoory… - … on Acoustics, Speech …, 2016 - ieeexplore.ieee.org

The prediction of symbolic prosodic categories from text is an important, but challenging,
natural-language processing task given the various ways in which an input can be realized …

Cited by 35 - Related articles - All 3 versions

Deep learning for prominence detection in children's read speech

M Vaidya, K Sabu, P Rao - ICASSP 2022-2022 IEEE …, 2022 - ieeexplore.ieee.org

The detection of perceived prominence in speech has attracted approaches ranging from
the design of knowledge-based linguistic and acoustic features to the automatic feature …

Cited by 7 - Related articles - All 9 versions

Acoustic and temporal representations in convolutional neural network models of prosodic events

S Stehwien, A Schweitzer, NT Vu - Speech Communication, 2020 - Elsevier

Prosodic events such as pitch accents and phrase boundaries have various acoustic and
temporal correlates that are used as features in machine learning models to automatically …

Cited by 13 - Related articles

3PRO–An unsupervised method for the automatic detection of sentence prominence in speech

S Kakouros, O Räsänen - Speech Communication, 2016 - Elsevier

Automatic detection of prominence in speech has attracted interest in recent years due to its
multiple uses in spoken language applications. However, typical approaches require …

Cited by 28 - Related articles - All 6 versions

Psst! prosodic speech segmentation with transformers

N Roll, C Graham, S Todd - arXiv preprint arXiv:2302.01984, 2023 - arxiv.org

Self-attention mechanisms have enabled transformers to achieve superhuman-level
performance on many speech-to-text (STT) tasks, yet the challenge of automatic prosodic …

Cited by 4 - Related articles - All 5 versions

Improving Prosodic Boundaries Prediction for Mandarin Speech Synthesis by Using Enhanced Embedding Feature and Model Fusion Approach.

Y Zheng, Y Li, Z Wen, X Ding, J Tao - INTERSPEECH, 2016 - isca-archive.org

Hierarchical prosody structure generation is an important but challenging component for
speech synthesis systems. In this paper, we investigate the use of enhanced embedding …

Cited by 23 - Related articles - All 4 versions

Prosodic event recognition using convolutional neural networks with context information

S Stehwien, NT Vu - arXiv preprint arXiv:1706.00741, 2017 - arxiv.org

This paper demonstrates the potential of convolutional neural networks (CNN) for detecting
and classifying prosodic events on words, specifically pitch accents and phrase boundary …

Cited by 21 - Related articles - All 7 versions

Speech, prosody, and machines: Nine challenges for prosody research

A Rosenberg - Proc. Speech Prosody, 2018 - isca-archive.org

Speech technology is becoming commonplace. Traditional telephony based interactive
voice systems have been joined by virtual assistants and navigation systems to create a …

Cited by 17 - Related articles - All 5 versions