Foundations and Trends in Multimodal Machine Learning: Principles, Challenges, and Open Questions

PP Liang, A Zadeh, LP Morency - arXiv preprint arXiv:2209.03430, 2022 - arxiv.org
Multimodal machine learning is a vibrant multi-disciplinary research field that aims to design
computer agents with intelligent capabilities such as understanding, reasoning, and learning …

Machine-generated text: A comprehensive survey of threat models and detection methods

EN Crothers, N Japkowicz, HL Viktor - IEEE Access, 2023 - ieeexplore.ieee.org
Machine-generated text is increasingly difficult to distinguish from text authored by humans.
Powerful open-source models are freely available, and user-friendly tools that democratize …

Video summarization using deep neural networks: A survey

E Apostolidis, E Adamantidou, AI Metsai… - Proceedings of the …, 2021 - ieeexplore.ieee.org
Video summarization technologies aim to create a concise and complete synopsis by
selecting the most informative parts of the video content. Several approaches have been …

Align and attend: Multimodal summarization with dual contrastive losses

B He, J Wang, J Qiu, T Bui… - Proceedings of the …, 2023 - openaccess.thecvf.com
The goal of multimodal summarization is to extract the most important information from
different modalities to form summaries. Unlike unimodal summarization, the multimodal …

Multi-document summarization via deep learning techniques: A survey

C Ma, WE Zhang, M Guo, H Wang, QZ Sheng - ACM Computing Surveys, 2022 - dl.acm.org
Multi-document summarization (MDS) is an effective tool for information aggregation that
generates an informative and concise summary from a cluster of topic-related documents …

How2: a large-scale dataset for multimodal language understanding

R Sanabria, O Caglayan, S Palaskar, D Elliott… - arXiv preprint arXiv …, 2018 - arxiv.org
In this paper, we introduce How2, a multimodal collection of instructional videos with English
subtitles and crowdsourced Portuguese translations. We also present integrated sequence …

Foundations & trends in multimodal machine learning: Principles, challenges, and open questions

PP Liang, A Zadeh, LP Morency - ACM Computing Surveys, 2024 - dl.acm.org
Multimodal machine learning is a vibrant multi-disciplinary research field that aims to design
computer agents with intelligent capabilities such as understanding, reasoning, and learning …

[BOOK][B] Text data mining

C Zong, R Xia, J Zhang - 2021 - Springer
With the rapid development and popularization of Internet and mobile communication
technologies, text data mining has attracted much attention. In particular, with the wide use …

MSMO: Multimodal summarization with multimodal output

J Zhu, H Li, T Liu, Y Zhou, J Zhang… - Proceedings of the 2018 …, 2018 - aclanthology.org
Multimodal summarization has drawn much attention due to the rapid growth of multimedia
data. The output of the current multimodal summarization systems is usually represented in …

Neural natural language generation: A survey on multilinguality, multimodality, controllability and learning

E Erdem, M Kuyu, S Yagcioglu, A Frank… - Journal of Artificial …, 2022 - jair.org
Developing artificial learning systems that can understand and generate natural language
has been one of the long-standing goals of artificial intelligence. Recent decades have …