Deep learning-based multimodal emotion recognition from audio, visual, and text modalities: A systematic review of recent advancements and future prospects

S Zhang, Y Yang, C Chen, X Zhang, Q Leng… - Expert Systems with …, 2024 - Elsevier
Emotion recognition has recently attracted extensive interest due to its significant
applications in human–computer interaction. The expression of human emotion depends on …

A survey of multimodal deep generative models

M Suzuki, Y Matsuo - Advanced Robotics, 2022 - Taylor & Francis
Multimodal learning is a framework for building models that make predictions based on
different types of modalities. Important challenges in multimodal learning are the inference of …

Dawn of the transformer era in speech emotion recognition: closing the valence gap

J Wagner, A Triantafyllopoulos… - … on Pattern Analysis …, 2023 - ieeexplore.ieee.org
Recent advances in transformer-based architectures have shown promise in several
machine learning tasks. In the audio domain, such architectures have been successfully …

Decoupled multimodal distilling for emotion recognition

Y Li, Y Wang, Z Cui - … of the IEEE/CVF Conference on …, 2023 - openaccess.thecvf.com
Human multimodal emotion recognition (MER) aims to perceive human emotions via
language, visual and acoustic modalities. Despite the impressive performance of previous …

Improving multimodal fusion with hierarchical mutual information maximization for multimodal sentiment analysis

W Han, H Chen, S Poria - arXiv preprint arXiv:2109.00412, 2021 - arxiv.org
In multimodal sentiment analysis (MSA), the performance of a model highly depends on the
quality of synthesized embeddings. These embeddings are generated from the upstream …

Disentangled representation learning for multimodal emotion recognition

D Yang, S Huang, H Kuang, Y Du… - Proceedings of the 30th …, 2022 - dl.acm.org
Multimodal emotion recognition aims to identify human emotions from text, audio, and visual
modalities. Previous methods either explore correlations between different modalities or …

MISA: Modality-invariant and -specific representations for multimodal sentiment analysis

D Hazarika, R Zimmermann, S Poria - Proceedings of the 28th ACM …, 2020 - dl.acm.org
Multimodal Sentiment Analysis is an active area of research that leverages multimodal
signals for affective understanding of user-generated videos. The predominant approach …

Multimodal transformer for unaligned multimodal language sequences

YHH Tsai, S Bai, PP Liang, JZ Kolter… - Proceedings of the …, 2019 - ncbi.nlm.nih.gov
Human language is often multimodal, comprising a mixture of natural language,
facial gestures, and acoustic behaviors. However, two major challenges in modeling such …

UniVL: A unified video and language pre-training model for multimodal understanding and generation

H Luo, L Ji, B Shi, H Huang, N Duan, T Li, J Li… - arXiv preprint arXiv …, 2020 - arxiv.org
With the recent success of pre-training techniques for NLP and image-linguistic tasks,
video-linguistic pre-training works have gradually been developed to improve video-text …

Multimodal language analysis in the wild: CMU-MOSEI dataset and interpretable dynamic fusion graph

AAB Zadeh, PP Liang, S Poria, E Cambria… - Proceedings of the …, 2018 - aclanthology.org
Analyzing human multimodal language is an emerging area of research in NLP. Intrinsically
this language is multimodal (heterogeneous), sequential and asynchronous; it consists of …