Survey on bimodal speech emotion recognition from acoustic and linguistic information fusion
Speech emotion recognition (SER) is traditionally performed using merely acoustic
information. Acoustic features, commonly extracted per frame, are mapped into emotion …
Dawn of the transformer era in speech emotion recognition: closing the valence gap
J Wagner, A Triantafyllopoulos… - … on Pattern Analysis …, 2023 - ieeexplore.ieee.org
Recent advances in transformer-based architectures have shown promise in several
machine learning tasks. In the audio domain, such architectures have been successfully …
Contextual and cross-modal interaction for multi-modal speech emotion recognition
Speech emotion recognition combining linguistic content and audio signals in the dialog is a
challenging task. Nevertheless, previous approaches have failed to explore emotion cues in …
Fusing ASR outputs in joint training for speech emotion recognition
Alongside acoustic information, linguistic features based on speech transcripts have been
proven useful in Speech Emotion Recognition (SER). However, due to the scarcity of …
Self-supervised contrastive cross-modality representation learning for spoken question answering
Spoken question answering (SQA) requires fine-grained understanding of both spoken
documents and questions for the optimal answer prediction. In this paper, we propose novel …
Multimodal emotion recognition with temporal and semantic consistency
Automated multimodal emotion recognition has become an emerging but challenging
research topic in the fields of affective learning and sentiment analysis. The existing works …
A fine-grained modal label-based multi-stage network for multimodal sentiment analysis
Sentiment analysis is a challenging but valuable research topic in affective computing. It can
improve the quality of various real-world applications, including financial market prediction …
Cross-corpus speech emotion recognition based on few-shot learning and domain adaptation
Within a single speech emotion corpus, deep neural networks have shown decent
performance in speech emotion recognition. However, the performance of the emotion …
Fusion approaches for emotion recognition from speech using acoustic and text-based features
In this paper, we study different approaches for classifying emotions from speech using
acoustic and text-based features. We propose to obtain contextualized word embeddings …
Speaker-invariant affective representation learning via adversarial training
Representation learning for speech emotion recognition is challenging due to labeled data
sparsity issue and lack of gold-standard references. In addition, there is much variability from …