Foundations and Trends in Multimodal Machine Learning: Principles, Challenges, and Open Questions
Multimodal machine learning is a vibrant multi-disciplinary research field that aims to design
computer agents with intelligent capabilities such as understanding, reasoning, and learning …
computer agents with intelligent capabilities such as understanding, reasoning, and learning …
Foundations & Trends in Multimodal Machine Learning: Principles, Challenges, and Open Questions
Multimodal machine learning is a vibrant multi-disciplinary research field that aims to design
computer agents with intelligent capabilities such as understanding, reasoning, and learning …
computer agents with intelligent capabilities such as understanding, reasoning, and learning …
A survey of automatic text summarization: Progress, process and challenges
With the evolution of the Internet and multimedia technology, the amount of text data has
increased exponentially. This text volume is a precious source of information and knowledge …
increased exponentially. This text volume is a precious source of information and knowledge …
An intelligent video analysis method for abnormal event detection in intelligent transportation systems
Intelligent transportation systems pervasively deploy thousands of video cameras. Analyzing
live video streams from these cameras is of significant importance to public safety. As …
live video streams from these cameras is of significant importance to public safety. As …
Intelligent character recognition using fully convolutional neural networks
The recognition of handwritten text is challenging as there are virtually infinite ways a human
can write the same message. Deep learning approaches for handwriting analysis have …
can write the same message. Deep learning approaches for handwriting analysis have …
A long video caption generation algorithm for big video data retrieval
Videos captured by people are often tied to certain important moments of their lives. But with
the era of big data coming, the time required to retrieval and watch can be daunting. In this …
the era of big data coming, the time required to retrieval and watch can be daunting. In this …
Move forward and tell: A progressive generator of video descriptions
We present an efficient framework that can generate a coherent paragraph to describe a
given video. Previous works on video captioning usually focus on video clips. They typically …
given video. Previous works on video captioning usually focus on video clips. They typically …
Multimodal abstractive summarization for how2 videos
In this paper, we study abstractive summarization for open-domain videos. Unlike the
traditional text news summarization, the goal is less to" compress" text information but rather …
traditional text news summarization, the goal is less to" compress" text information but rather …
Multinet++: Multi-stream feature aggregation and geometric loss strategy for multi-task learning
Multi-task learning is commonly used in autonomous driving for solving various visual
perception tasks. It offers significant benefits in terms of both performance and computational …
perception tasks. It offers significant benefits in terms of both performance and computational …
A survey of recent work on video summarization: approaches and techniques
V Tiwari, C Bhatnagar - Multimedia Tools and Applications, 2021 - Springer
The volume of video data generated has seen an exponential growth over the years and
video summarization has emerged as a process that can facilitate efficient storage, quick …
video summarization has emerged as a process that can facilitate efficient storage, quick …