Foundations and Trends in Multimodal Machine Learning: Principles, Challenges, and Open Questions
Multimodal machine learning is a vibrant multi-disciplinary research field that aims to design
computer agents with intelligent capabilities such as understanding, reasoning, and learning …
computer agents with intelligent capabilities such as understanding, reasoning, and learning …
Foundations & trends in multimodal machine learning: Principles, challenges, and open questions
Multimodal machine learning is a vibrant multi-disciplinary research field that aims to design
computer agents with intelligent capabilities such as understanding, reasoning, and learning …
computer agents with intelligent capabilities such as understanding, reasoning, and learning …
The emerging trends of multi-label learning
Exabytes of data are generated daily by humans, leading to the growing needs for new
efforts in dealing with the grand challenges for multi-label learning brought by big data. For …
efforts in dealing with the grand challenges for multi-label learning brought by big data. For …
[HTML][HTML] Multibench: Multiscale benchmarks for multimodal representation learning
Learning multimodal representations involves integrating information from multiple
heterogeneous sources of data. It is a challenging yet crucial area with numerous real-world …
heterogeneous sources of data. It is a challenging yet crucial area with numerous real-world …
Wider face: A face detection benchmark
Face detection is one of the most studied topics in the computer vision community. Much of
the progresses have been made by the availability of face detection benchmark datasets …
the progresses have been made by the availability of face detection benchmark datasets …
Big data: A survey
In this paper, we review the background and state-of-the-art of big data. We first introduce
the general background of big data and review related technologies, such as could …
the general background of big data and review related technologies, such as could …
Learning to separate object sounds by watching unlabeled video
Perceiving a scene most fully requires all the senses. Yet modeling how objects look and
sound is challenging: most natural scenes and events contain multiple objects, and the …
sound is challenging: most natural scenes and events contain multiple objects, and the …
A survey of knowledge graph reasoning on graph types: Static, dynamic, and multi-modal
Knowledge graph reasoning (KGR), aiming to deduce new facts from existing facts based on
mined logic rules underlying knowledge graphs (KGs), has become a fast-growing research …
mined logic rules underlying knowledge graphs (KGs), has become a fast-growing research …
Large-scale visual sentiment ontology and detectors using adjective noun pairs
We address the challenge of sentiment analysis from visual content. In contrast to existing
methods which infer sentiment or emotion directly from visual low-level features, we propose …
methods which infer sentiment or emotion directly from visual low-level features, we propose …
Deepsentibank: Visual sentiment concept classification with deep convolutional neural networks
This paper introduces a visual sentiment concept classification method based on deep
convolutional neural networks (CNNs). The visual sentiment concepts are adjective noun …
convolutional neural networks (CNNs). The visual sentiment concepts are adjective noun …