Foundations and Trends in Multimodal Machine Learning: Principles, Challenges, and Open Questions
Multimodal machine learning is a vibrant multi-disciplinary research field that aims to design
computer agents with intelligent capabilities such as understanding, reasoning, and learning …
computer agents with intelligent capabilities such as understanding, reasoning, and learning …
Machine-generated text: A comprehensive survey of threat models and detection methods
Machine-generated text is increasingly difficult to distinguish from text authored by humans.
Powerful open-source models are freely available, and user-friendly tools that democratize …
Powerful open-source models are freely available, and user-friendly tools that democratize …
Video summarization using deep neural networks: A survey
Video summarization technologies aim to create a concise and complete synopsis by
selecting the most informative parts of the video content. Several approaches have been …
selecting the most informative parts of the video content. Several approaches have been …
Align and attend: Multimodal summarization with dual contrastive losses
The goal of multimodal summarization is to extract the most important information from
different modalities to form summaries. Unlike unimodal summarization, the multimodal …
different modalities to form summaries. Unlike unimodal summarization, the multimodal …
Multi-document summarization via deep learning techniques: A survey
Multi-document summarization (MDS) is an effective tool for information aggregation that
generates an informative and concise summary from a cluster of topic-related documents …
generates an informative and concise summary from a cluster of topic-related documents …
How2: a large-scale dataset for multimodal language understanding
In this paper, we introduce How2, a multimodal collection of instructional videos with English
subtitles and crowdsourced Portuguese translations. We also present integrated sequence …
subtitles and crowdsourced Portuguese translations. We also present integrated sequence …
Foundations & trends in multimodal machine learning: Principles, challenges, and open questions
Multimodal machine learning is a vibrant multi-disciplinary research field that aims to design
computer agents with intelligent capabilities such as understanding, reasoning, and learning …
computer agents with intelligent capabilities such as understanding, reasoning, and learning …
MSMO: Multimodal summarization with multimodal output
Multimodal summarization has drawn much attention due to the rapid growth of multimedia
data. The output of the current multimodal summarization systems is usually represented in …
data. The output of the current multimodal summarization systems is usually represented in …
Neural natural language generation: A survey on multilinguality, multimodality, controllability and learning
Developing artificial learning systems that can understand and generate natural language
has been one of the long-standing goals of artificial intelligence. Recent decades have …
has been one of the long-standing goals of artificial intelligence. Recent decades have …