Emotion recognition in speech using cross-modal transfer in the wild

S Albanie, A Nagrani, A Vedaldi… - Proceedings of the 26th …, 2018 - dl.acm.org
Obtaining large, human-labelled speech datasets to train models for emotion recognition is a
notoriously challenging task, hindered by annotation cost and label ambiguity. In this work …

Multi-task semi-supervised adversarial autoencoding for speech emotion recognition

S Latif, R Rana, S Khalifa, R Jurdak… - IEEE Transactions …, 2020 - ieeexplore.ieee.org
Despite the emerging importance of Speech Emotion Recognition (SER), the state-of-the-art
accuracy is quite low and needs improvement to make commercial applications of SER …

Multimodal and temporal perception of audio-visual cues for emotion recognition

E Ghaleb, M Popa, S Asteriadis - 2019 8th international …, 2019 - ieeexplore.ieee.org
In Audio-Video Emotion Recognition (AVER), the idea is to have a human-level
understanding of emotions from video clips. There is a need to bring these two modalities …

Using regional saliency for speech emotion recognition

Z Aldeneh, EM Provost - 2017 IEEE international conference on …, 2017 - ieeexplore.ieee.org
In this paper, we show that convolutional neural networks can be directly applied to temporal
low-level acoustic features to identify emotionally salient regions without the need for …

The ordinal nature of emotions: An emerging approach

GN Yannakakis, R Cowie… - IEEE Transactions on …, 2018 - ieeexplore.ieee.org
Computational representation of everyday emotional states is a challenging task and,
arguably, one of the most fundamental for affective computing. Standard practice in emotion …

Progressive neural networks for transfer learning in emotion recognition

J Gideon, S Khorram, Z Aldeneh, D Dimitriadis… - arXiv preprint arXiv …, 2017 - arxiv.org
Many paralinguistic tasks are closely related and thus representations learned in one
domain can be leveraged for another. In this paper, we investigate how knowledge can be …

Chunk-level speech emotion recognition: A general framework of sequence-to-one dynamic temporal modeling

WC Lin, C Busso - IEEE Transactions on Affective Computing, 2021 - ieeexplore.ieee.org
A critical issue of current speech-based sequence-to-one learning tasks, such as speech
emotion recognition (SER), is the dynamic temporal modeling for speech sentences with …

Multitask learning from augmented auxiliary data for improving speech emotion recognition

S Latif, R Rana, S Khalifa, R Jurdak… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Despite the recent progress in speech emotion recognition (SER), state-of-the-art systems
lack generalisation across different conditions. A key underlying reason for poor …

Can large language models aid in annotating speech emotional data? Uncovering new frontiers

S Latif, M Usama, MI Malik, BW Schuller - arXiv preprint arXiv:2307.06090, 2023 - arxiv.org
Despite recent advancements in speech emotion recognition (SER) models, state-of-the-art
deep learning (DL) approaches face the challenge of the limited availability of annotated …

Multimodal attention-mechanism for temporal emotion recognition

E Ghaleb, J Niehues… - 2020 IEEE International …, 2020 - ieeexplore.ieee.org
Exploiting the multimodal and temporal interaction between audio-visual channels is
essential for automatic audio-video emotion recognition (AVER). Modalities' strength in …