Deep reinforcement learning in computer vision: a comprehensive survey

N Le, VS Rathour, K Yamazaki, K Luu… - Artificial Intelligence …, 2022 - Springer
Deep reinforcement learning augments the reinforcement learning framework and utilizes
the powerful representation of deep neural networks. Recent works have demonstrated the …

An analytical study of information extraction from unstructured and multidimensional big data

K Adnan, R Akbar - Journal of Big Data, 2019 - Springer
Process of information extraction (IE) is used to extract useful information from unstructured
or semi-structured data. Big data arise new challenges for IE techniques with the rapid …

UniVTG: Towards unified video-language temporal grounding

KQ Lin, P Zhang, J Chen… - Proceedings of the …, 2023 - openaccess.thecvf.com
Video Temporal Grounding (VTG), which aims to ground target clips from videos
(such as consecutive intervals or disjoint shots) according to custom language queries (e.g. …

EgoVLPv2: Egocentric video-language pre-training with fusion in the backbone

S Pramanick, Y Song, S Nag, KQ Lin… - Proceedings of the …, 2023 - openaccess.thecvf.com
Video-language pre-training (VLP) has become increasingly important due to its ability to
generalize to various vision and language tasks. However, existing egocentric VLP …

A survey of deep learning-based object detection

L Jiao, F Zhang, F Liu, S Yang, L Li, Z Feng… - IEEE Access, 2019 - ieeexplore.ieee.org
Object detection is one of the most important and challenging branches of computer vision,
which has been widely applied in people's life, such as monitoring security, autonomous …

Video summarization using deep neural networks: A survey

E Apostolidis, E Adamantidou, AI Metsai… - Proceedings of the …, 2021 - ieeexplore.ieee.org
Video summarization technologies aim to create a concise and complete synopsis by
selecting the most informative parts of the video content. Several approaches have been …

Align and attend: Multimodal summarization with dual contrastive losses

B He, J Wang, J Qiu, T Bui… - Proceedings of the …, 2023 - openaccess.thecvf.com
The goal of multimodal summarization is to extract the most important information from
different modalities to form summaries. Unlike unimodal summarization, the multimodal …

CLIP-It! language-guided video summarization

M Narasimhan, A Rohrbach… - Advances in neural …, 2021 - proceedings.neurips.cc
A generic video summary is an abridged version of a video that conveys the whole story and
features the most important scenes. Yet the importance of scenes in a video is often …

Deep reinforcement learning for unsupervised video summarization with diversity-representativeness reward

K Zhou, Y Qiao, T Xiang - Proceedings of the AAAI conference on …, 2018 - ojs.aaai.org
Video summarization aims to facilitate large-scale video browsing by producing short,
concise summaries that are diverse and representative of original videos. In this paper, we …

Unsupervised video summarization with adversarial LSTM networks

B Mahasseni, M Lam… - Proceedings of the IEEE …, 2017 - openaccess.thecvf.com
This paper addresses the problem of unsupervised video summarization, formulated as
selecting a sparse subset of video frames that optimally represent the input video. Our key …