Emotion recognition in speech using cross-modal transfer in the wild
Obtaining large, human-labelled speech datasets to train models for emotion recognition is a
notoriously challenging task, hindered by annotation cost and label ambiguity. In this work …
Multi-task semi-supervised adversarial autoencoding for speech emotion recognition
In spite of the emerging importance of Speech Emotion Recognition (SER), the state-of-the-art
accuracy is quite low and needs improvement to make commercial applications of SER …
Multimodal and temporal perception of audio-visual cues for emotion recognition
In Audio-Video Emotion Recognition (AVER), the idea is to have a human-level
understanding of emotions from video clips. There is a need to bring these two modalities …
Using regional saliency for speech emotion recognition
Z Aldeneh, EM Provost - 2017 IEEE international conference on …, 2017 - ieeexplore.ieee.org
In this paper, we show that convolutional neural networks can be directly applied to temporal
low-level acoustic features to identify emotionally salient regions without the need for …
The ordinal nature of emotions: An emerging approach
GN Yannakakis, R Cowie… - IEEE Transactions on …, 2018 - ieeexplore.ieee.org
Computational representation of everyday emotional states is a challenging task and,
arguably, one of the most fundamental for affective computing. Standard practice in emotion …
Progressive neural networks for transfer learning in emotion recognition
Many paralinguistic tasks are closely related and thus representations learned in one
domain can be leveraged for another. In this paper, we investigate how knowledge can be …
Chunk-level speech emotion recognition: A general framework of sequence-to-one dynamic temporal modeling
A critical issue of current speech-based sequence-to-one learning tasks, such as speech
emotion recognition (SER), is the dynamic temporal modeling for speech sentences with …
Multitask learning from augmented auxiliary data for improving speech emotion recognition
Despite the recent progress in speech emotion recognition (SER), state-of-the-art systems
lack generalisation across different conditions. A key underlying reason for poor …
Can large language models aid in annotating speech emotional data? uncovering new frontiers
Despite recent advancements in speech emotion recognition (SER) models, state-of-the-art
deep learning (DL) approaches face the challenge of the limited availability of annotated …
Multimodal attention-mechanism for temporal emotion recognition
Exploiting the multimodal and temporal interaction between audio-visual channels is
essential for automatic audio-video emotion recognition (AVER). Modalities' strength in …