Deep learning-based multimodal emotion recognition from audio, visual, and text modalities: A systematic review of recent advancements and future prospects

S Zhang, Y Yang, C Chen, X Zhang, Q Leng… - Expert Systems with …, 2024 - Elsevier
Emotion recognition has recently attracted extensive interest due to its significant
applications to human–computer interaction. The expression of human emotion depends on …

Current advances and future perspectives of image fusion: A comprehensive review

S Karim, G Tong, J Li, A Qadir, U Farooq, Y Yu - Information Fusion, 2023 - Elsevier
Multiple imaging modalities can be combined to provide more information about the real
world than a single modality alone. Infrared images discriminate targets with respect to their …

Multimodal prompting with missing modalities for visual recognition

YL Lee, YH Tsai, WC Chiu… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
In this paper, we tackle two challenges in multimodal learning for visual recognition: 1) when
missing-modality occurs either during training or testing in real-world situations; and 2) when …

Distribution-consistent modal recovering for incomplete multimodal learning

Y Wang, Z Cui, Y Li - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
Recovering missed modality is popular in incomplete multimodal learning because it usually
benefits downstream tasks. However, the existing methods often directly estimate missed …

MER 2023: Multi-label learning, modality robustness, and semi-supervised learning

Z Lian, H Sun, L Sun, K Chen, M Xu, K Wang… - Proceedings of the 31st …, 2023 - dl.acm.org
The first Multimodal Emotion Recognition Challenge (MER 2023) was successfully held at
ACM Multimedia. The challenge focuses on system robustness and consists of three distinct …

GCNet: Graph completion network for incomplete multimodal learning in conversation

Z Lian, L Chen, L Sun, B Liu… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Conversations have become a critical data format on social media platforms. Understanding
conversation from emotion, content and other aspects also attracts increasing attention from …

Efficient multimodal transformer with dual-level feature restoration for robust multimodal sentiment analysis

L Sun, Z Lian, B Liu, J Tao - IEEE Transactions on Affective …, 2023 - ieeexplore.ieee.org
With the proliferation of user-generated online videos, Multimodal Sentiment Analysis (MSA)
has attracted increasing attention recently. Despite significant progress, there are still two …

Multimodal distillation for egocentric action recognition

G Radevski, D Grujicic, M Blaschko… - Proceedings of the …, 2023 - openaccess.thecvf.com
The focal point of egocentric video understanding is modelling hand-object interactions.
Standard models, e.g., CNNs or Vision Transformers, which receive RGB frames as input …

Modality translation-based multimodal sentiment analysis under uncertain missing modalities

Z Liu, B Zhou, D Chu, Y Sun, L Meng - Information Fusion, 2024 - Elsevier
Multimodal sentiment analysis (MSA) with uncertain missing modalities poses a new
challenge in sentiment analysis. To address this problem, efficient MSA models that …

Multimodal representation learning by alternating unimodal adaptation

X Zhang, J Yoon, M Bansal… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Multimodal learning, which integrates data from diverse sensory modes, plays a pivotal role
in artificial intelligence. However, existing multimodal learning methods often struggle with …