Speech emotion recognition with co-attention based multi-level acoustic information

H Zou, Y Si, C Chen, D Rajan… - ICASSP 2022-2022 IEEE …, 2022 - ieeexplore.ieee.org
Speech Emotion Recognition (SER) aims to help machines understand humans' subjective emotions from audio information alone. However, extracting and utilizing …

Deep Multimodal Data Fusion

F Zhao, C Zhang, B Geng - ACM Computing Surveys, 2024 - dl.acm.org
Multimodal Artificial Intelligence (Multimodal AI), in general, involves various types of data (e.g., images, texts, or data collected from different sensors), feature engineering (e.g., …

Efficient multimodal transformer with dual-level feature restoration for robust multimodal sentiment analysis

L Sun, Z Lian, B Liu, J Tao - IEEE Transactions on Affective …, 2023 - ieeexplore.ieee.org
With the proliferation of user-generated online videos, Multimodal Sentiment Analysis (MSA)
has attracted increasing attention recently. Despite significant progress, there are still two …

Contextual and cross-modal interaction for multi-modal speech emotion recognition

D Yang, S Huang, Y Liu, L Zhang - IEEE Signal Processing …, 2022 - ieeexplore.ieee.org
Speech emotion recognition that combines linguistic content and audio signals in dialogue is a challenging task. Nevertheless, previous approaches have failed to explore emotion cues in …

Multimodal fusion on low-quality data: A comprehensive survey

Q Zhang, Y Wei, Z Han, H Fu, X Peng, C Deng… - arXiv preprint arXiv …, 2024 - arxiv.org
Multimodal fusion focuses on integrating information from multiple modalities with the goal of
more accurate prediction, which has achieved remarkable progress in a wide range of …

Multimodal emotion recognition using cross modal audio-video fusion with attention and deep metric learning

B Mocanu, R Tapu, T Zaharia - Image and Vision Computing, 2023 - Elsevier
In the last few years, multi-modal emotion recognition has become an important research issue in the affective computing community due to its wide range of applications that include …

Attention-based multi-learning approach for speech emotion recognition with dilated convolution

S Kakuba, A Poulose, DS Han - IEEE Access, 2022 - ieeexplore.ieee.org
The success of deep learning in speech emotion recognition has led to its application in
resource-constrained devices. It has been applied in human-to-machine interaction …

Is cross-attention preferable to self-attention for multi-modal emotion recognition?

V Rajan, A Brutti, A Cavallaro - ICASSP 2022-2022 IEEE …, 2022 - ieeexplore.ieee.org
Humans express their emotions via facial expressions, voice intonation and word choices.
To infer the nature of the underlying emotion, recognition models may use a single modality …
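For context on the mechanism the title refers to: in cross-attention fusion, one modality's features serve as queries while the other modality provides keys and values, whereas self-attention fusion attends over a single (often concatenated) sequence. The sketch below is a minimal, generic illustration of cross-attention between audio and text streams, not the implementation from this paper; the module name, feature dimensions, and mean-pooling choice are illustrative assumptions.

import torch
import torch.nn as nn

class CrossModalAttentionFusion(nn.Module):
    # Illustrative cross-attention fusion: audio tokens query text tokens.
    def __init__(self, dim: int = 256, num_heads: int = 4):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, audio: torch.Tensor, text: torch.Tensor) -> torch.Tensor:
        # audio: (batch, T_audio, dim); text: (batch, T_text, dim)
        # Queries come from the audio stream; keys/values from the text stream.
        attended, _ = self.cross_attn(query=audio, key=text, value=text)
        fused = self.norm(audio + attended)   # residual connection then layer norm
        return fused.mean(dim=1)              # pooled utterance-level embedding

# Usage sketch with random tensors standing in for real encoder outputs.
audio_feats = torch.randn(8, 100, 256)   # e.g., frame-level acoustic embeddings
text_feats = torch.randn(8, 30, 256)     # e.g., token-level text embeddings
fusion = CrossModalAttentionFusion()
utterance_emb = fusion(audio_feats, text_feats)   # shape (8, 256), fed to an emotion classifier

A self-attention counterpart would instead concatenate the audio and text token sequences and apply attention over the joint sequence; the paper above compares the two designs empirically.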

AMPS: Predicting popularity of short-form videos using multi-modal attention mechanisms in social media marketing environments

M Cho, D Jeong, E Park - Journal of Retailing and Consumer Services, 2024 - Elsevier
Emerging as a dominant content format amid the shift from television to mobile, short-form
videos wield immense potential across diverse domains. However, the scarcity of datasets …

A cross-modal fusion network based on self-attention and residual structure for multimodal emotion recognition

Z Fu, F Liu, H Wang, J Qi, X Fu, A Zhou, Z Li - arXiv preprint arXiv …, 2021 - arxiv.org
Audio-video-based multimodal emotion recognition has attracted a lot of attention due to its robust performance. Most of the existing methods focus on proposing different cross …