Streaming dense video captioning
… a streaming model for dense video captioning as shown in Fig. 1. Our streaming model does
not require access to all input frames concurrently in order to process the video thanks to a …
Streamlined dense video captioning
… dense video captioning framework, which models temporal dependency across events in
a video … of event proposals to our sequential video captioning network, which is trained by …
Multi-modal dense video captioning
… dense video captions for an example video sequence. Most recent works in dense video
captioning formulate the captioning … of features extracted from the video stream and the output is …
An efficient framework for dense video captioning
M Suin, AN Rajagopalan - Proceedings of the AAAI Conference on Artificial …, 2020 - aaai.org
… This is in part due to the huge size of raw video streams and the presence of redundant
information in the frames. Most of the existing frameworks, for every time step, need to pass the …
End-to-end dense video captioning with parallel decoding
… In practice, we consider the dense video captioning task as a set prediction problem. The
proposed PDVC directly decodes the frame features, which are extracted from a Vision …
Dense relational captioning: Triple-stream networks for relationship-based captioning
… MTTSNet denotes our final model, multi-task triple-stream network with POS classifier. … for
dense captioning task [12], we suggest a new evaluation metric for relational dense captioning. …
Environment-aware dense video captioning for IoT-enabled edge cameras
CH Lu, GY Fan - IEEE Internet of Things Journal, 2021 - ieeexplore.ieee.org
… dense-video-captioning model based on the Transformer framework to improve execution
efficiency for video-caption … The source of ActivityNet’s videos is the streaming video platform …
Dense-captioning events in videos
… In addition, we show a variant of our captioning module that can operate on streaming
videos by attending over only the past events. Our full model attends over both past as well as …
Attention-based densely connected LSTM for video captioning
Y Zhu, S Jiang - Proceedings of the 27th ACM international conference …, 2019 - dl.acm.org
… streams). To more effectively combine different modalities, they trained modality-specific
LSTMs to capture the intrinsic representations of individual modalities. For video captioning, the …
End-to-end dense video captioning with masked transformer
… and the captioning modules … dense video captioning that is able to produce proposal and
description simultaneously. Also, our work directly incorporates the semantics from captions to …
Related searches
- multi-modal dense video captioning
- dense video captioning masked transformer
- dense video captioning large scale pretraining
- dense video captioning audio visual cues
- dense video captioning parallel decoding
- dense video captioning edge cameras
- dense video captioning efficient framework
- dense video captioning information fusion
- dense video captioning hierarchical representation
- densely connected LSTM video captioning
- video captioning challenge
- sparse attention video captioning
- hierarchical modular network video captioning
- understanding objects video captioning
- meta concepts video captioning
- relational graph video captioning