Video summarisation: A conceptual framework and survey of the state of the art

AG Money, H Agius - Journal of visual communication and image …, 2008 - Elsevier
Video summaries provide condensed and succinct representations of the content of a video
stream through a combination of still images, video segments, graphical representations and …

A review of text and image retrieval approaches for broadcast news video

R Yan, AG Hauptmann - Information Retrieval, 2007 - Springer
The effectiveness of a video retrieval system largely depends on the choice of underlying
text and image retrieval components. The unique properties of video collections (e.g., multiple …

Video2commonsense: Generating commonsense descriptions to enrich video captioning

Z Fang, T Gokhale, P Banerjee, C Baral… - arXiv preprint arXiv …, 2020 - arxiv.org
Captioning is a crucial and challenging task for video understanding. In videos that involve
active agents such as humans, the agent's actions can bring about myriad changes in the …

Multiple pairwise ranking networks for personalized video summarization

Y Saquil, D Chen, Y He, C Li… - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com
In this paper, we investigate video summarization in the supervised setting. Since video
summarization is subjective to the preference of the end-user, the design of a unique model …

Discovering high quality answers in community question answering archives using a hierarchy of classifiers

H Toba, ZY Ming, M Adriani, TS Chua - Information Sciences, 2014 - Elsevier
In community-based question answering (CQA) services where answers are generated by
humans, users may expect better answers than an automatic question answering system …

MultiVENT: Multilingual Videos of Events and Aligned Natural Text

K Sanders, D Etter, R Kriz… - Advances in Neural …, 2023 - proceedings.neurips.cc
Everyday news coverage has shifted from traditional broadcasts towards a wide range of
presentation formats such as first-hand, unedited video footage. Datasets that reflect the …

Dynamic graph representation learning for video dialog via multi-modal shuffled transformers

S Geng, P Gao, M Chatterjee, C Hori… - Proceedings of the …, 2021 - ojs.aaai.org
Given an input video, its associated audio, and a brief caption, the audio-visual scene aware
dialog (AVSD) task requires an agent to indulge in a question-answer dialog with a human …

A novel framework for semantic annotation and personalized retrieval of sports video

C Xu, J Wang, H Lu, Y Zhang - IEEE transactions on …, 2008 - ieeexplore.ieee.org
Sports video annotation is important for sports video semantic analysis such as event
detection and personalization. In this paper, we propose a novel approach for sports video …

A hybrid watermarking scheme for H.264/AVC video

G Qiu, P Marziliano, ATS Ho, D He… - Proceedings of the 17th …, 2004 - ieeexplore.ieee.org
A novel H.264/AVC watermarking method is proposed in this paper. By embedding the
robust watermark into DCT domain and the fragile watermark into motion vectors …

Focal visual-text attention for memex question answering

J Liang, L Jiang, L Cao, Y Kalantidis… - IEEE transactions on …, 2019 - ieeexplore.ieee.org
Recent insights on language and vision with neural networks have been successfully
applied to simple single-image visual question answering. However, to tackle real-life …