Foundations and trends in multimodal machine learning: Principles, challenges, and open questions

PP Liang, A Zadeh, LP Morency - arXiv preprint arXiv:2209.03430, 2022 - arxiv.org

Multimodal machine learning is a vibrant multi-disciplinary research field that aims to design
computer agents with intelligent capabilities such as understanding, reasoning, and learning …

Cited by 123 - Related articles - All 2 versions

[PDF] arxiv.org

Multimodal co-learning: Challenges, applications with datasets, recent advances and future directions

A Rahate, R Walambe, S Ramanna, K Kotecha - Information Fusion, 2022 - Elsevier

Multimodal deep learning systems that employ multiple modalities like text, image, audio,
video, etc., are showing better performance than individual modalities (i.e., unimodal) …

Cited by 111 - Related articles - All 4 versions

[PDF] cssclab.cn

Multimodal sentiment analysis based on fusion methods: A survey

L Zhu, Z Zhu, C Zhang, Y Xu, X Kong - Information Fusion, 2023 - Elsevier

Sentiment analysis is an emerging technology that aims to explore people's attitudes toward
an entity. It can be applied in a variety of different fields and scenarios, such as product …

Cited by 90 - Related articles - All 5 versions

[PDF] arxiv.org

Learning audio-visual speech representation by masked multimodal cluster prediction

B Shi, WN Hsu, K Lakhotia, A Mohamed - arXiv preprint arXiv:2201.02184, 2022 - arxiv.org

Video recordings of speech contain correlated audio and visual information, providing a
strong signal for speech representation learning from the speaker's lip movements and the …

Cited by 234 - Related articles - All 3 versions

[PDF] arxiv.org

Improving multimodal fusion with hierarchical mutual information maximization for multimodal sentiment analysis

W Han, H Chen, S Poria - arXiv preprint arXiv:2109.00412, 2021 - arxiv.org

In multimodal sentiment analysis (MSA), the performance of a model highly depends on the
quality of synthesized embeddings. These embeddings are generated from the upstream …

Cited by 253 - Related articles - All 5 versions

[PDF] thecvf.com

Are multimodal transformers robust to missing modality?

M Ma, J Ren, L Zhao, D Testuggine… - Proceedings of the …, 2022 - openaccess.thecvf.com

Multimodal data collected from the real world are often imperfect due to missing modalities.
Therefore multimodal models that are robust against modal-incomplete data are highly …

Cited by 118 - Related articles - All 8 versions

[PDF] github.io

Disentangled representation learning for multimodal emotion recognition

D Yang, S Huang, H Kuang, Y Du… - Proceedings of the 30th …, 2022 - dl.acm.org

Multimodal emotion recognition aims to identify human emotions from text, audio, and visual
modalities. Previous methods either explore correlations between different modalities or …

Cited by 100 - Related articles - All 3 versions

[PDF] thecvf.com

Decoupled multimodal distilling for emotion recognition

Y Li, Y Wang, Z Cui - … of the IEEE/CVF Conference on …, 2023 - openaccess.thecvf.com

Human multimodal emotion recognition (MER) aims to perceive human emotions via
language, visual and acoustic modalities. Despite the impressive performance of previous …

Cited by 58 - Related articles - All 8 versions

[PDF] acm.org

MISA: Modality-invariant and -specific representations for multimodal sentiment analysis

D Hazarika, R Zimmermann, S Poria - Proceedings of the 28th ACM …, 2020 - dl.acm.org

Multimodal Sentiment Analysis is an active area of research that leverages multimodal
signals for affective understanding of user-generated videos. The predominant approach …

Cited by 549 - Related articles - All 3 versions

[PDF] thecvf.com

Multimodal prompting with missing modalities for visual recognition

YL Lee, YH Tsai, WC Chiu… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

In this paper, we tackle two challenges in multimodal learning for visual recognition: 1) when
missing-modality occurs either during training or testing in real-world situations; and 2) when …

Cited by 54 - Related articles - All 8 versions