A review on speech emotion recognition using deep learning and attention mechanism

E Lieskovská, M Jakubec, R Jarina, M Chmulík - Electronics, 2021 - mdpi.com
Emotions are an integral part of human interactions and are significant factors in determining
user satisfaction or customer opinion. Speech emotion recognition (SER) modules also play …

Modeling Speech Emotion Recognition via Attention-Oriented Parallel CNN Encoders

F Makhmudov, A Kutlimuratov, F Akhmedov… - Electronics, 2022 - mdpi.com
Meticulous learning of human emotions through speech is an indispensable function of
modern speech emotion recognition (SER) models. Consequently, deriving and interpreting …

Predicting expressive speaking style from text in end-to-end speech synthesis

D Stanton, Y Wang… - 2018 IEEE Spoken …, 2018 - ieeexplore.ieee.org
Global Style Tokens (GSTs) are a recently-proposed method to learn latent disentangled
representations of high-dimensional data. GSTs can be used within Tacotron, a state-of-the …

Transformer encoder with multi-modal multi-head attention for continuous affect recognition

H Chen, D Jiang, H Sahli - IEEE Transactions on Multimedia, 2020 - ieeexplore.ieee.org
Continuous affect recognition is becoming an increasingly attractive research topic in
affective computing. Previous works mainly focused on modelling the temporal dependency …

Improving cross-corpus speech emotion recognition with adversarial discriminative domain generalization (ADDoG)

J Gideon, MG McInnis… - IEEE Transactions on …, 2019 - ieeexplore.ieee.org
Automatic speech emotion recognition provides computers with critical context to enable
user understanding. While methods trained and tested within the same dataset have been …

Few-shot learning for fine-grained emotion recognition using physiological signals

T Zhang, A El Ali, A Hanjalic… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Fine-grained emotion recognition can model the temporal dynamics of emotions, which is
more precise than predicting one emotion retrospectively for an activity (e.g., video clip …

Learning speech emotion representations in the quaternion domain

E Guizzo, T Weyde, S Scardapane… - … /ACM Transactions on …, 2023 - ieeexplore.ieee.org
The modeling of human emotion expression in speech signals is an important, yet
challenging task. The high resource demand of speech emotion recognition models …

The PRIORI emotion dataset: Linking mood to emotion detected in-the-wild

S Khorram, M Jaiswal, J Gideon, M McInnis… - arXiv preprint arXiv …, 2018 - arxiv.org
Bipolar Disorder is a chronic psychiatric illness characterized by pathological mood swings
associated with severe disruptions in emotion regulation. Clinical monitoring of mood is key …

Convolutional Neural Network Based Speaker De-Identification

F Bahmaninezhad, C Zhang, JHL Hansen - Odyssey, 2018 - isca-archive.org
Concealing speaker identity in speech signals refers to the task of speaker de-identification,
which helps protect the privacy of a speaker. Although both linguistic and paralinguistic …

Continuous emotion recognition in videos by fusing facial expression, head pose and eye gaze

S Wu, Z Du, W Li, D Huang, Y Wang - 2019 International conference on …, 2019 - dl.acm.org
Continuous emotion recognition is of great significance in affective computing and
human-computer interaction. Most existing methods for video-based continuous emotion …