Semantic text summarization of long videos

Foundations & Trends in Multimodal Machine Learning: Principles, Challenges, and Open Questions

PP Liang, A Zadeh, LP Morency - arXiv preprint arXiv:2209.03430, 2022 - arxiv.org

Multimodal machine learning is a vibrant multi-disciplinary research field that aims to design
computer agents with intelligent capabilities such as understanding, reasoning, and learning …

Cited by 119 · All 2 versions

Foundations & Trends in Multimodal Machine Learning: Principles, Challenges, and Open Questions

PP Liang, A Zadeh, LP Morency - ACM Computing Surveys, 2023 - dl.acm.org

Multimodal machine learning is a vibrant multi-disciplinary research field that aims to design
computer agents with intelligent capabilities such as understanding, reasoning, and learning …

Cited by 9

A survey of automatic text summarization: Progress, process and challenges

MF Mridha, AA Lima, K Nur, SC Das, M Hasan… - IEEE …, 2021 - ieeexplore.ieee.org

With the evolution of the Internet and multimedia technology, the amount of text data has
increased exponentially. This text volume is a precious source of information and knowledge …

Cited by 84 · All 3 versions

An intelligent video analysis method for abnormal event detection in intelligent transportation systems

S Wan, X Xu, T Wang, Z Gu - IEEE Transactions on Intelligent …, 2020 - ieeexplore.ieee.org

Intelligent transportation systems pervasively deploy thousands of video cameras. Analyzing
live video streams from these cameras is of significant importance to public safety. As …

Cited by 127 · All 5 versions

Intelligent character recognition using fully convolutional neural networks

R Ptucha, FP Such, S Pillai, F Brockler, V Singh… - Pattern recognition, 2019 - Elsevier

The recognition of handwritten text is challenging as there are virtually infinite ways a human
can write the same message. Deep learning approaches for handwriting analysis have …

Cited by 175 · All 3 versions

A long video caption generation algorithm for big video data retrieval

S Ding, S Qu, Y Xi, S Wan - Future Generation Computer Systems, 2019 - Elsevier

Videos captured by people are often tied to certain important moments of their lives. But with
the era of big data coming, the time required to retrieve and watch them can be daunting. In this …

Cited by 133 · All 2 versions

Move forward and tell: A progressive generator of video descriptions

Y Xiong, B Dai, D Lin - Proceedings of the European …, 2018 - openaccess.thecvf.com

We present an efficient framework that can generate a coherent paragraph to describe a
given video. Previous works on video captioning usually focus on video clips. They typically …

Cited by 128 · All 8 versions

Multimodal abstractive summarization for how2 videos

S Palaskar, J Libovický, S Gella, F Metze - arXiv preprint arXiv:1906.07901, 2019 - arxiv.org

In this paper, we study abstractive summarization for open-domain videos. Unlike the
traditional text news summarization, the goal is less to "compress" text information but rather …

Cited by 105 · All 6 versions

Multinet++: Multi-stream feature aggregation and geometric loss strategy for multi-task learning

S Chennupati, G Sistu, S Yogamani… - Proceedings of the …, 2019 - openaccess.thecvf.com

Multi-task learning is commonly used in autonomous driving for solving various visual
perception tasks. It offers significant benefits in terms of both performance and computational …

Cited by 101 · All 10 versions

A survey of recent work on video summarization: approaches and techniques

V Tiwari, C Bhatnagar - Multimedia Tools and Applications, 2021 - Springer

The volume of video data generated has seen an exponential growth over the years and
video summarization has emerged as a process that can facilitate efficient storage, quick …

Cited by 42 · All 5 versions