Speech emotion recognition: Emotional models, databases, features, preprocessing methods, supporting modalities, and classifiers

MB Akçay, K Oğuz - Speech Communication, 2020 - Elsevier
Speech is the most natural way of expressing ourselves as humans. It is only natural then to
extend this communication medium to computer applications. We define speech emotion …

A review on speech emotion recognition using deep learning and attention mechanism

E Lieskovská, M Jakubec, R Jarina, M Chmulík - Electronics, 2021 - mdpi.com
Emotions are an integral part of human interactions and are significant factors in determining
user satisfaction or customer opinion. Speech emotion recognition (SER) modules also play …

AST: Audio spectrogram transformer

Y Gong, YA Chung, J Glass - arXiv preprint arXiv:2104.01778, 2021 - arxiv.org
In the past decade, convolutional neural networks (CNNs) have been widely adopted as the
main building block for end-to-end audio classification models, which aim to learn a direct …

A fine-tuned wav2vec 2.0/HuBERT benchmark for speech emotion recognition, speaker verification and spoken language understanding

Y Wang, A Boumadane, A Heba - arXiv preprint arXiv:2111.02735, 2021 - arxiv.org
Speech self-supervised models such as wav2vec 2.0 and HuBERT are making revolutionary
progress in Automatic Speech Recognition (ASR). However, they have not been totally …

Speech emotion recognition with co-attention based multi-level acoustic information

H Zou, Y Si, C Chen, D Rajan… - ICASSP 2022-2022 IEEE …, 2022 - ieeexplore.ieee.org
Speech Emotion Recognition (SER) aims to help the machine to understand human's
subjective emotion from only audio information. However, extracting and utilizing …

Two-way feature extraction for speech emotion recognition using deep learning

A Aggarwal, A Srivastava, A Agarwal, N Chahal… - Sensors, 2022 - mdpi.com
Recognizing human emotions by machines is a complex task. Deep learning models
attempt to automate this process by rendering machines to exhibit learning capabilities …

Speech emotion recognition using recurrent neural networks with directional self-attention

D Li, J Liu, Z Yang, L Sun, Z Wang - Expert Systems with Applications, 2021 - Elsevier
As an important branch of affective computing, Speech Emotion Recognition (SER) plays a
vital role in human–computer interaction. In order to mine the relevance of signals in audios …

A survey of speech emotion recognition in natural environment

MS Fahad, A Ranjan, J Yadav, A Deepak - Digital Signal Processing, 2021 - Elsevier
While speech emotion recognition (SER) has been an active research field since the last
three decades, the techniques that deal with the natural environment have only emerged in …

Learning alignment for multimodal emotion recognition from speech

H Xu, H Zhang, K Han, Y Wang, Y Peng, X Li - arXiv preprint arXiv …, 2019 - arxiv.org
Speech emotion recognition is a challenging problem because humans convey emotions in
subtle and complex ways. For emotion recognition on human speech, one can either extract …

Head fusion: Improving the accuracy and robustness of speech emotion recognition on the IEMOCAP and RAVDESS dataset

M Xu, F Zhang, W Zhang - IEEE Access, 2021 - ieeexplore.ieee.org
Speech Emotion Recognition (SER) refers to the use of machines to recognize the emotions
of a speaker from his (or her) speech. SER benefits Human-Computer Interaction (HCI). But …