Survey on bimodal speech emotion recognition from acoustic and linguistic information fusion
Speech emotion recognition (SER) is traditionally performed using merely acoustic
information. Acoustic features, commonly extracted per frame, are mapped into emotion …
A general survey on attention mechanisms in deep learning
G Brauwers, F Frasincar - IEEE Transactions on Knowledge …, 2021 - ieeexplore.ieee.org
Attention is an important mechanism that can be employed for a variety of deep learning
models across many different domains and tasks. This survey provides an overview of the …
CTNet: Conversational transformer network for emotion recognition
Emotion recognition in conversation is a crucial topic for its widespread applications in the
field of human-computer interactions. Unlike vanilla emotion recognition of individual …
Deep multimodal emotion recognition on human speech: A review
P Koromilas, T Giannakopoulos - Applied Sciences, 2021 - mdpi.com
This work reviews the state of the art in multimodal speech emotion recognition
methodologies, focusing on audio, text and visual information. We provide a new …
Speech emotion recognition using self-supervised features
Self-supervised pre-trained features have consistently delivered state-of-art results in the
field of natural language processing (NLP); however, their merits in the field of speech …
Multimodal speech emotion recognition using audio and text
Speech emotion recognition is a challenging task, and extensive reliance has been placed
on models that use audio features in building well-performing classifiers. In this paper, we …
Survey of deep representation learning for speech emotion recognition
Traditionally, speech emotion recognition (SER) research has relied on manually
handcrafted acoustic features using feature engineering. However, the design of …
M3er: Multiplicative multimodal emotion recognition using facial, textual, and speech cues
We present M3ER, a learning-based method for emotion recognition from multiple input
modalities. Our approach combines cues from multiple co-occurring modalities (such as …
Att-Net: Enhanced emotion recognition system using lightweight self-attention module
S Kwon - Applied Soft Computing, 2021 - Elsevier
Speech emotion recognition (SER) is an active research field of digital signal processing
and plays a crucial role in numerous applications of Human–computer interaction (HCI) …
Efficient speech emotion recognition using multi-scale cnn and attention
Z Peng, Y Lu, S Pan, Y Liu - ICASSP 2021-2021 IEEE …, 2021 - ieeexplore.ieee.org
Emotion recognition from speech is a challenging task. Recent advances in deep learning
have led bi-directional recurrent neural network (Bi-RNN) and attention mechanism as a …