Foundations and Trends in Multimodal Machine Learning: Principles, Challenges, and Open Questions

PP Liang, A Zadeh, LP Morency - arXiv preprint arXiv:2209.03430, 2022 - arxiv.org
Multimodal machine learning is a vibrant multi-disciplinary research field that aims to design
computer agents with intelligent capabilities such as understanding, reasoning, and learning …

Foundations & trends in multimodal machine learning: Principles, challenges, and open questions

PP Liang, A Zadeh, LP Morency - ACM Computing Surveys, 2024 - dl.acm.org
Multimodal machine learning is a vibrant multi-disciplinary research field that aims to design
computer agents with intelligent capabilities such as understanding, reasoning, and learning …

The emerging trends of multi-label learning

W Liu, H Wang, X Shen… - IEEE transactions on …, 2021 - ieeexplore.ieee.org
Exabytes of data are generated daily by humans, leading to a growing need for new
efforts in dealing with the grand challenges for multi-label learning brought by big data. For …

Multibench: Multiscale benchmarks for multimodal representation learning

PP Liang, Y Lyu, X Fan, Z Wu, Y Cheng… - Advances in neural …, 2021 - ncbi.nlm.nih.gov
Learning multimodal representations involves integrating information from multiple
heterogeneous sources of data. It is a challenging yet crucial area with numerous real-world …

Wider face: A face detection benchmark

S Yang, P Luo, CC Loy, X Tang - Proceedings of the IEEE …, 2016 - openaccess.thecvf.com
Face detection is one of the most studied topics in the computer vision community. Much of
the progress has been made possible by the availability of face detection benchmark datasets …

Big data: A survey

M Chen, S Mao, Y Liu - Mobile networks and applications, 2014 - Springer
In this paper, we review the background and state-of-the-art of big data. We first introduce
the general background of big data and review related technologies, such as cloud …

Learning to separate object sounds by watching unlabeled video

R Gao, R Feris, K Grauman - Proceedings of the European …, 2018 - openaccess.thecvf.com
Perceiving a scene most fully requires all the senses. Yet modeling how objects look and
sound is challenging: most natural scenes and events contain multiple objects, and the …

A survey of knowledge graph reasoning on graph types: Static, dynamic, and multi-modal

K Liang, L Meng, M Liu, Y Liu, W Tu… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org
Knowledge graph reasoning (KGR), aiming to deduce new facts from existing facts based on
mined logic rules underlying knowledge graphs (KGs), has become a fast-growing research …

Large-scale visual sentiment ontology and detectors using adjective noun pairs

D Borth, R Ji, T Chen, T Breuel, SF Chang - Proceedings of the 21st …, 2013 - dl.acm.org
We address the challenge of sentiment analysis from visual content. In contrast to existing
methods which infer sentiment or emotion directly from visual low-level features, we propose …

Deepsentibank: Visual sentiment concept classification with deep convolutional neural networks

T Chen, D Borth, T Darrell, SF Chang - arXiv preprint arXiv:1410.8586, 2014 - arxiv.org
This paper introduces a visual sentiment concept classification method based on deep
convolutional neural networks (CNNs). The visual sentiment concepts are adjective noun …