VideoQA: question answering on news video

AG Money, H Agius - Journal of visual communication and image …, 2008 - Elsevier

Video summaries provide condensed and succinct representations of the content of a video
stream through a combination of still images, video segments, graphical representations and …

Cited by 545 Related articles All 8 versions

A review of text and image retrieval approaches for broadcast news video

R Yan, AG Hauptmann - Information Retrieval, 2007 - Springer

The effectiveness of a video retrieval system largely depends on the choice of underlying
text and image retrieval components. The unique properties of video collections (e.g., multiple …

Cited by 111 Related articles All 13 versions

Video2commonsense: Generating commonsense descriptions to enrich video captioning

Z Fang, T Gokhale, P Banerjee, C Baral… - arXiv preprint arXiv …, 2020 - arxiv.org

Captioning is a crucial and challenging task for video understanding. In videos that involve
active agents such as humans, the agent's actions can bring about myriad changes in the …

Cited by 67 Related articles All 9 versions

Multiple pairwise ranking networks for personalized video summarization

Y Saquil, D Chen, Y He, C Li… - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com

In this paper, we investigate video summarization in the supervised setting. Since video
summarization is subjective to the preference of the end-user, the design of a unique model …

Cited by 28 Related articles All 7 versions

Discovering high quality answers in community question answering archives using a hierarchy of classifiers

H Toba, ZY Ming, M Adriani, TS Chua - Information Sciences, 2014 - Elsevier

In community-based question answering (CQA) services where answers are generated by
humans, users may expect better answers than an automatic question answering system …

Cited by 143 Related articles All 7 versions

MultiVENT: Multilingual Videos of Events and Aligned Natural Text

K Sanders, D Etter, R Kriz… - Advances in Neural …, 2023 - proceedings.neurips.cc

Everyday news coverage has shifted from traditional broadcasts towards a wide range of
presentation formats such as first-hand, unedited video footage. Datasets that reflect the …

Cited by 4 Related articles All 6 versions

Dynamic graph representation learning for video dialog via multi-modal shuffled transformers

S Geng, P Gao, M Chatterjee, C Hori… - Proceedings of the …, 2021 - ojs.aaai.org

Given an input video, its associated audio, and a brief caption, the audio-visual scene aware
dialog (AVSD) task requires an agent to indulge in a question-answer dialog with a human …

Cited by 43 Related articles All 11 versions

A novel framework for semantic annotation and personalized retrieval of sports video

C Xu, J Wang, H Lu, Y Zhang - IEEE transactions on …, 2008 - ieeexplore.ieee.org

Sports video annotation is important for sports video semantic analysis such as event
detection and personalization. In this paper, we propose a novel approach for sports video …

Cited by 181 Related articles All 10 versions

A hybrid watermarking scheme for H.264/AVC video

G Qiu, P Marziliano, ATS Ho, D He… - Proceedings of the 17th …, 2004 - ieeexplore.ieee.org

A novel H.264/AVC watermarking method is proposed in this paper. By embedding the
robust watermark into DCT domain and the fragile watermark into motion vectors …

Cited by 203 Related articles All 6 versions

Focal visual-text attention for memex question answering

J Liang, L Jiang, L Cao, Y Kalantidis… - IEEE transactions on …, 2019 - ieeexplore.ieee.org

Recent insights on language and vision with neural networks have been successfully
applied to simple single-image visual question answering. However, to tackle real-life …

Cited by 68 Related articles All 10 versions