Semantic segmentation in compressed videos

A Li, Y Lu, Y Wang - 2019 IEEE 21st International Workshop on Multimedia Signal …, 2019 - ieeexplore.ieee.org
Existing approaches for semantic segmentation in videos usually extract each frame as an RGB image, then apply standard image-based semantic segmentation models to each frame. This is time-consuming. In this paper, we tackle this problem by exploring the nature of video compression techniques. A compressed video contains three types of frames: I-frames, P-frames, and B-frames. I-frames are represented as regular images, P-frames are represented as motion vectors and residual errors, and B-frames are bidirectional frames that can be regarded as a special case of P-frames. We propose a method that operates directly on the I-frames (as RGB images) and P-frames (motion vectors and residual errors) in a video. Our proposed model uses a ConvLSTM model to capture the temporal information in the video required for producing the semantic segmentation on P-frames. Our experimental results show that our method runs much faster than other alternatives while achieving similar accuracy.
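To make the P-frame representation concrete, the following is a minimal sketch of block-based motion compensation: a P-frame is approximated by copying displaced blocks from a reference frame and then adding the residual error. It is not the ConvLSTM model described in the abstract, and it does not follow any particular codec. The function names, the single-channel integer frames, the integer-pixel `(dy, dx)` vectors pointing from the current pixel to its source in the reference, and the border clamping are all assumptions made for illustration.

```python
from typing import List, Tuple

Grid = List[List[int]]                     # single-channel frame (assumption)
VectorField = List[List[Tuple[int, int]]]  # one (dy, dx) vector per block


def motion_compensate(reference: Grid, motion_vectors: VectorField, block: int) -> Grid:
    """Predict a frame by copying each pixel from its displaced position in the reference."""
    height, width = len(reference), len(reference[0])
    predicted = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # All pixels in the same block share one motion vector.
            dy, dx = motion_vectors[y // block][x // block]
            # Clamp the source position at the frame border (assumption of this sketch).
            src_y = min(max(y + dy, 0), height - 1)
            src_x = min(max(x + dx, 0), width - 1)
            predicted[y][x] = reference[src_y][src_x]
    return predicted


def reconstruct_p_frame(reference: Grid, motion_vectors: VectorField,
                        residual: Grid, block: int) -> Grid:
    """Reconstruct a P-frame as the motion-compensated prediction plus the residual."""
    predicted = motion_compensate(reference, motion_vectors, block)
    return [[p + r for p, r in zip(pred_row, res_row)]
            for pred_row, res_row in zip(predicted, residual)]


if __name__ == "__main__":
    reference = [[1, 2, 3, 4],
                 [5, 6, 7, 8],
                 [9, 10, 11, 12],
                 [13, 14, 15, 16]]
    # One vector per 2x2 block, laid out as a 2x2 grid of blocks.
    motion_vectors = [[(0, 0), (0, -1)],
                      [(1, 0), (0, 0)]]
    residual = [[0] * 4 for _ in range(4)]
    residual[0][0] = 10
    for row in reconstruct_p_frame(reference, motion_vectors, residual, block=2):
        print(row)
```

The same `motion_compensate` helper could be applied to a label map instead of pixel values, which would give a naive way to carry an I-frame segmentation forward to a P-frame. That baseline ignores the residual, which is one reason a learned temporal model such as the ConvLSTM mentioned above may be preferable.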