A review on speech emotion recognition using deep learning and attention mechanism

E Lieskovská, M Jakubec, R Jarina, M Chmulík - Electronics, 2021 - mdpi.com
Emotions are an integral part of human interactions and are significant factors in determining
user satisfaction or customer opinion. Speech emotion recognition (SER) modules also play …

Deep representation learning in speech processing: Challenges, recent advances, and future trends

S Latif, R Rana, S Khalifa, R Jurdak, J Qadir… - arXiv preprint arXiv …, 2020 - arxiv.org
Research on speech processing has traditionally considered the task of designing hand-
engineered acoustic features (feature engineering) as a separate distinct problem from the …

A survey of the state of explainable AI for natural language processing

M Danilevsky, K Qian, R Aharonov, Y Katsis… - arXiv preprint arXiv …, 2020 - arxiv.org
Recent years have seen important advances in the quality of state-of-the-art models, but this
has come at the expense of models becoming less interpretable. This survey presents an …

Deep learning for human affect recognition: Insights and new developments

PV Rouast, MTP Adam, R Chiong - IEEE Transactions on …, 2019 - ieeexplore.ieee.org
Automatic human affect recognition is a key step towards more natural human-computer
interaction. Recent trends include recognition in the wild using a fusion of audiovisual and …

Multimodal speech emotion recognition using audio and text

S Yoon, S Byun, K Jung - 2018 IEEE spoken language …, 2018 - ieeexplore.ieee.org
Speech emotion recognition is a challenging task, and extensive reliance has been placed
on models that use audio features in building well-performing classifiers. In this paper, we …

Survey of deep representation learning for speech emotion recognition

S Latif, R Rana, S Khalifa, R Jurdak… - IEEE Transactions …, 2021 - ieeexplore.ieee.org
Traditionally, speech emotion recognition (SER) research has relied on manually
handcrafted acoustic features using feature engineering. However, the design of …

Emotion recognition in speech using cross-modal transfer in the wild

S Albanie, A Nagrani, A Vedaldi… - Proceedings of the 26th …, 2018 - dl.acm.org
Obtaining large, human labelled speech datasets to train models for emotion recognition is a
notoriously challenging task, hindered by annotation cost and label ambiguity. In this work …

Data Augmentation Using GANs for Speech Emotion Recognition

A Chatziagapi, G Paraskevopoulos, D Sgouropoulos… - Interspeech, 2019 - slp-ntua.github.io
In this work, we address the problem of data imbalance for the task of Speech Emotion
Recognition (SER). We investigate conditioned data augmentation using Generative …

An attention pooling based representation learning method for speech emotion recognition

P Li, Y Song, IV McLoughlin, W Guo, LR Dai - 2018 - kar.kent.ac.uk
This paper proposes an attention pooling based representation learning method for speech
emotion recognition (SER). The emotional representation is learned in an end-to-end …

Attention based fully convolutional network for speech emotion recognition

Y Zhang, J Du, Z Wang, J Zhang… - 2018 Asia-Pacific Signal …, 2018 - ieeexplore.ieee.org
Speech emotion recognition is a challenging task for three main reasons: 1) human emotion
is abstract, which means it is hard to distinguish; 2) in general, human emotion can only be …