Streaming dense video captioning

X Zhou, A Arnab, S Buch, S Yan… - Proceedings of the …, 2024 - openaccess.thecvf.com
… a streaming model for dense video captioning as shown in Fig. 1. Our streaming model does
not require access to all input frames concurrently in order to process the video thanks to a …

Streamlined dense video captioning

J Mun, L Yang, Z Ren, N Xu… - Proceedings of the IEEE …, 2019 - openaccess.thecvf.com
dense video captioning framework, which models temporal dependency across events in
a video … of event proposals to our sequential video captioning network, which is trained by …

Multi-modal dense video captioning

V Iashin, E Rahtu - … of the IEEE/CVF conference on …, 2020 - openaccess.thecvf.com
dense video captions for an example video sequence. Most recent works in dense video
captioning formulate the captioning … of features extracted from the video stream and the output is …

An efficient framework for dense video captioning

M Suin, AN Rajagopalan - Proceedings of the AAAI Conference on Artificial …, 2020 - aaai.org
… This is in part due to the huge size of raw video streams and the presence of redundant
information in the frames. Most of the existing frameworks, for every time step, need to pass the …

End-to-end dense video captioning with parallel decoding

T Wang, R Zhang, Z Lu, F Zheng… - Proceedings of the …, 2021 - openaccess.thecvf.com
… In practice, we consider the dense video captioning task as a set prediction problem. The
proposed PDVC directly decodes the frame features, which are extracted from a Vision …

Dense relational captioning: Triple-stream networks for relationship-based captioning

DJ Kim, J Choi, TH Oh… - Proceedings of the IEEE …, 2019 - openaccess.thecvf.com
… MTTSNet denotes our final model, multi-task triple-stream network with POS classifier. … for
dense captioning task [12], we suggest a new evaluation metric for relational dense captioning. …

Environment-aware dense video captioning for IoT-enabled edge cameras

CH Lu, GY Fan - IEEE Internet of Things Journal, 2021 - ieeexplore.ieee.org
dense-video-captioning model based on the Transformer framework to improve execution
efficiency for video-caption … The source of ActivityNet’s videos is the streaming video platform …

Dense-captioning events in videos

R Krishna, K Hata, F Ren, L Fei-Fei… - Proceedings of the …, 2017 - openaccess.thecvf.com
… In addition, we show a variant of our captioning module that can operate on streaming
videos by attending over only the past events. Our full model attends over both past as well as …

Attention-based densely connected LSTM for video captioning

Y Zhu, S Jiang - Proceedings of the 27th ACM international conference …, 2019 - dl.acm.org
streams). To more effectively combine different modalities, they trained modality-specific
LSTMs to capture the intrinsic representations of individual modalities. For video captioning, the …

End-to-end dense video captioning with masked transformer

L Zhou, Y Zhou, JJ Corso… - Proceedings of the …, 2018 - openaccess.thecvf.com
… and the captioning modules … dense video captioning that is able to produce proposal and
description simultaneously. Also, our work directly incorporates the semantics from captions to …