End-to-end dense video captioning with parallel decoding

T Wang, R Zhang, Z Lu, F Zheng… - Proceedings of the …, 2021 - openaccess.thecvf.com
Dense video captioning aims to generate multiple associated captions with their temporal
locations from the video. Previous methods follow a sophisticated "localize-then-describe" …

Video captioning: a review of theory, techniques and practices.

V Jain, F Al-Turjman, G Chaudhary… - Multimedia Tools & …, 2022 - search.ebscohost.com
In today's world, video captioning is extensively used in various applications for specially-
abled and, more specifically, visually abled persons. With advancements in technology for …

A review of deep learning for video captioning

M Abdar, M Kollati, S Kuraparthi, F Pourpanah… - arXiv preprint arXiv …, 2023 - arxiv.org
Video captioning (VC) is a fast-moving, cross-disciplinary area of research that bridges work
in the fields of computer vision, natural language processing (NLP), linguistics, and human …

Controllable video captioning with POS sequence guidance based on gated fusion network

B Wang, L Ma, W Zhang, W Jiang… - Proceedings of the …, 2019 - openaccess.thecvf.com
In this paper, we propose to guide the video caption generation with Part-of-Speech (POS)
information, based on a gated fusion of multiple representations of input videos. We …

Iterative alignment network for continuous sign language recognition

J Pu, W Zhou, H Li - … of the IEEE/CVF conference on …, 2019 - openaccess.thecvf.com
In this paper, we propose an alignment network with iterative optimization for weakly
supervised continuous sign language recognition. Our framework consists of two modules: a …

Object-aware aggregation with bidirectional temporal graph for video captioning

J Zhang, Y Peng - Proceedings of the IEEE/CVF conference …, 2019 - openaccess.thecvf.com
Video captioning aims to automatically generate natural language descriptions of video
content, which has drawn a lot of attention in recent years. Generating accurate and fine …

ADAPT: Action-aware driving caption transformer

B Jin, X Liu, Y Zheng, P Li, H Zhao… - … on Robotics and …, 2023 - ieeexplore.ieee.org
End-to-end autonomous driving has great potential in the transportation industry. However,
the lack of transparency and interpretability of the automatic decision-making process …

Learning modality interaction for temporal sentence localization and event captioning in videos

S Chen, W Jiang, W Liu, YG Jiang - … , Glasgow, UK, August 23–28, 2020 …, 2020 - Springer
Automatically generating sentences to describe events and temporally localizing sentences
in a video are two important tasks that bridge language and videos. Recent techniques …

SibNet: Sibling convolutional encoder for video captioning

S Liu, Z Ren, J Yuan - Proceedings of the 26th ACM international …, 2018 - dl.acm.org
Video captioning is a challenging task owing to the complexity of understanding the copious
visual information in videos and describing it using natural language. Different from previous …

Towards bridging event captioner and sentence localizer for weakly supervised dense event captioning

S Chen, YG Jiang - … of the IEEE/CVF Conference on …, 2021 - openaccess.thecvf.com
Dense Event Captioning (DEC) aims to jointly localize and describe multiple events
of interest in untrimmed videos, which is an advancement of the conventional video …