Survey on bimodal speech emotion recognition from acoustic and linguistic information fusion

BT Atmaja, A Sasou, M Akagi - Speech Communication, 2022 - Elsevier
Speech emotion recognition (SER) is traditionally performed using merely acoustic
information. Acoustic features, commonly extracted per frame, are mapped into emotion …

A general survey on attention mechanisms in deep learning

G Brauwers, F Frasincar - IEEE Transactions on Knowledge …, 2021 - ieeexplore.ieee.org
Attention is an important mechanism that can be employed for a variety of deep learning
models across many different domains and tasks. This survey provides an overview of the …

CTNet: Conversational transformer network for emotion recognition

Z Lian, B Liu, J Tao - IEEE/ACM Transactions on Audio, Speech …, 2021 - ieeexplore.ieee.org
Emotion recognition in conversation is a crucial topic for its widespread applications in the
field of human-computer interactions. Unlike vanilla emotion recognition of individual …

Deep multimodal emotion recognition on human speech: A review

P Koromilas, T Giannakopoulos - Applied Sciences, 2021 - mdpi.com
This work reviews the state of the art in multimodal speech emotion recognition
methodologies, focusing on audio, text and visual information. We provide a new …

Speech emotion recognition using self-supervised features

E Morais, R Hoory, W Zhu, I Gat… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org
Self-supervised pre-trained features have consistently delivered state-of-the-art results in the
field of natural language processing (NLP); however, their merits in the field of speech …

Multimodal speech emotion recognition using audio and text

S Yoon, S Byun, K Jung - 2018 IEEE spoken language …, 2018 - ieeexplore.ieee.org
Speech emotion recognition is a challenging task, and extensive reliance has been placed
on models that use audio features in building well-performing classifiers. In this paper, we …

Survey of deep representation learning for speech emotion recognition

S Latif, R Rana, S Khalifa, R Jurdak… - IEEE Transactions …, 2021 - ieeexplore.ieee.org
Traditionally, speech emotion recognition (SER) research has relied on manually
handcrafted acoustic features using feature engineering. However, the design of …

M3ER: Multiplicative multimodal emotion recognition using facial, textual, and speech cues

T Mittal, U Bhattacharya, R Chandra, A Bera… - Proceedings of the AAAI …, 2020 - aaai.org
We present M3ER, a learning-based method for emotion recognition from multiple input
modalities. Our approach combines cues from multiple co-occurring modalities (such as …

Att-Net: Enhanced emotion recognition system using lightweight self-attention module

S Kwon - Applied Soft Computing, 2021 - Elsevier
Speech emotion recognition (SER) is an active research field of digital signal processing
and plays a crucial role in numerous applications of human–computer interaction (HCI) …

Efficient speech emotion recognition using multi-scale CNN and attention

Z Peng, Y Lu, S Pan, Y Liu - ICASSP 2021-2021 IEEE …, 2021 - ieeexplore.ieee.org
Emotion recognition from speech is a challenging task. Recent advances in deep learning
have established the bi-directional recurrent neural network (Bi-RNN) and attention mechanism as a …