Deep reinforcement learning in computer vision: a comprehensive survey

N Le, VS Rathour, K Yamazaki, K Luu… - Artificial Intelligence …, 2022 - Springer
Deep reinforcement learning augments the reinforcement learning framework and utilizes
the powerful representation of deep neural networks. Recent works have demonstrated the …

Video object segmentation and tracking: A survey

R Yao, G Lin, S Xia, J Zhao, Y Zhou - ACM Transactions on Intelligent …, 2020 - dl.acm.org
Object segmentation and object tracking are fundamental research areas in the computer
vision community. These two topics are difficult to handle some common challenges, such …

EgoVLPv2: Egocentric video-language pre-training with fusion in the backbone

S Pramanick, Y Song, S Nag, KQ Lin… - Proceedings of the …, 2023 - openaccess.thecvf.com
Video-language pre-training (VLP) has become increasingly important due to its ability to
generalize to various vision and language tasks. However, existing egocentric VLP …

Video summarization using deep neural networks: A survey

E Apostolidis, E Adamantidou, AI Metsai… - Proceedings of the …, 2021 - ieeexplore.ieee.org
Video summarization technologies aim to create a concise and complete synopsis by
selecting the most informative parts of the video content. Several approaches have been …

Learning 2D temporal adjacent networks for moment localization with natural language

S Zhang, H Peng, J Fu, J Luo - Proceedings of the AAAI Conference on …, 2020 - ojs.aaai.org
We address the problem of retrieving a specific moment from an untrimmed video by a query
sentence. This is a challenging problem because a target moment may take place in …

Context-aware biaffine localizing network for temporal sentence grounding

D Liu, X Qu, J Dong, P Zhou, Y Cheng… - Proceedings of the …, 2021 - openaccess.thecvf.com
This paper addresses the problem of temporal sentence grounding (TSG), which aims to
identify the temporal boundary of a specific segment from an untrimmed video by a sentence …

Deep reinforcement learning for unsupervised video summarization with diversity-representativeness reward

K Zhou, Y Qiao, T Xiang - Proceedings of the AAAI conference on …, 2018 - ojs.aaai.org
Video summarization aims to facilitate large-scale video browsing by producing short,
concise summaries that are diverse and representative of original videos. In this paper, we …

Learning temporal regularity in video sequences

M Hasan, J Choi, J Neumann… - Proceedings of the …, 2016 - openaccess.thecvf.com
Perceiving meaningful activities in a long video sequence is a challenging problem due to
ambiguous definition of 'meaningfulness' as well as clutters in the scene. We approach this …

DSNet: A flexible detect-to-summarize network for video summarization

W Zhu, J Lu, J Li, J Zhou - IEEE Transactions on Image …, 2020 - ieeexplore.ieee.org
In this paper, we propose a Detect-to-Summarize network (DSNet) framework for supervised
video summarization. Our DSNet contains anchor-based and anchor-free counterparts. The …

Video summarization with long short-term memory

K Zhang, WL Chao, F Sha, K Grauman - … 14, 2016, Proceedings, Part VII 14, 2016 - Springer
We propose a novel supervised learning technique for summarizing videos by automatically
selecting keyframes or key subshots. Casting the task as a structured prediction problem …