Survey on bimodal speech emotion recognition from acoustic and linguistic information fusion

BT Atmaja, A Sasou, M Akagi - Speech Communication, 2022 - Elsevier
Speech emotion recognition (SER) is traditionally performed using merely acoustic
information. Acoustic features, commonly extracted per frame, are mapped into emotion …

A review on methods and applications in multimodal deep learning

S Jabeen, X Li, MS Amin, O Bourahla, S Li… - ACM Transactions on …, 2023 - dl.acm.org
Deep Learning has implemented a wide range of applications and has become increasingly
popular in recent years. The goal of multimodal deep learning (MMDL) is to create models …

The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS): A dynamic, multimodal set of facial and vocal expressions in North …

SR Livingstone, FA Russo - PloS one, 2018 - journals.plos.org
The RAVDESS is a validated multimodal database of emotional speech and song. The
database is gender balanced consisting of 24 professional actors, vocalizing lexically …

Emotional voice conversion: Theory, databases and ESD

K Zhou, B Sisman, R Liu, H Li - Speech Communication, 2022 - Elsevier
In this paper, we first provide a review of the state-of-the-art emotional voice conversion
research, and the existing emotional speech databases. We then motivate the development …

Survey of deep representation learning for speech emotion recognition

S Latif, R Rana, S Khalifa, R Jurdak… - IEEE Transactions …, 2021 - ieeexplore.ieee.org
Traditionally, speech emotion recognition (SER) research has relied on manually
handcrafted acoustic features using feature engineering. However, the design of …

Missing modality imagination network for emotion recognition with uncertain missing modalities

J Zhao, R Li, Q Jin - Proceedings of the 59th Annual Meeting of …, 2021 - aclanthology.org
Multimodal fusion has been proved to improve emotion recognition performance in previous
works. However, in real-world applications, we often encounter the problem of missing …

Building naturalistic emotionally balanced speech corpus by retrieving emotional speech from existing podcast recordings

R Lotfian, C Busso - IEEE Transactions on Affective Computing, 2017 - ieeexplore.ieee.org
The lack of a large, natural emotional database is one of the key barriers to translate results
on speech emotion recognition in controlled conditions into real-life applications. Collecting …

Automated accurate speech emotion recognition system using twine shuffle pattern and iterative neighborhood component analysis techniques

T Tuncer, S Dogan, UR Acharya - Knowledge-Based Systems, 2021 - Elsevier
Speech emotion recognition is one of the challenging research issues in the knowledge-
based system and various methods have been recommended to reach high classification …

K-EmoCon, a multimodal sensor dataset for continuous emotion recognition in naturalistic conversations

CY Park, N Cha, S Kang, A Kim, AH Khandoker… - Scientific Data, 2020 - nature.com
Recognizing emotions during social interactions has many potential applications with the
popularization of low-cost mobile sensors, but a challenge remains with the lack of …

Multimodal Emotion Recognition with deep learning: advancements, challenges, and future directions

AV Geetha, T Mala, D Priyanka, E Uma - Information Fusion, 2024 - Elsevier
In recent years, affective computing has become a topic of considerable interest, driven by
its ability to enhance several domains, such as mental health monitoring, human–computer …