Deep reinforcement learning in computer vision: a comprehensive survey

N Le, VS Rathour, K Yamazaki, K Luu… - Artificial Intelligence …, 2022 - Springer
Deep reinforcement learning augments the reinforcement learning framework and utilizes
the powerful representation of deep neural networks. Recent works have demonstrated the …

Video object segmentation and tracking: A survey

R Yao, G Lin, S Xia, J Zhao, Y Zhou - ACM Transactions on Intelligent …, 2020 - dl.acm.org
Object segmentation and object tracking are fundamental research areas in the computer
vision community. These two topics are difficult to handle some common challenges, such …

CoDeF: Content deformation fields for temporally consistent video processing

H Ouyang, Q Wang, Y Xiao, Q Bai… - Proceedings of the …, 2024 - openaccess.thecvf.com
We present the content deformation field (CoDeF) as a new type of video representation
which consists of a canonical content field aggregating the static contents in the entire video …

MOSE: A new dataset for video object segmentation in complex scenes

H Ding, C Liu, S He, X Jiang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Video object segmentation (VOS) aims at segmenting a particular object throughout the
entire video clip sequence. The state-of-the-art VOS methods have achieved excellent …

Fast online object tracking and segmentation: A unifying approach

Q Wang, L Zhang, L Bertinetto… - Proceedings of the …, 2019 - openaccess.thecvf.com
In this paper we illustrate how to perform both visual object tracking and semi-supervised
video object segmentation, in real-time, with a single simple approach. Our method, dubbed …

See more, know more: Unsupervised video object segmentation with co-attention Siamese networks

X Lu, W Wang, C Ma, J Shen… - Proceedings of the …, 2019 - openaccess.thecvf.com
We introduce a novel network, called as CO-attention Siamese Network (COSNet), to
address the unsupervised video object segmentation task from a holistic view. We …

Layered neural atlases for consistent video editing

Y Kasten, D Ofri, O Wang, T Dekel - ACM Transactions on Graphics …, 2021 - dl.acm.org
We present a method that decomposes, and "unwraps", an input video into a set of layered
2D atlases, each providing a unified representation of the appearance of an object (or …

SSTVOS: Sparse spatiotemporal transformers for video object segmentation

B Duke, A Ahmed, C Wolf, P Aarabi… - Proceedings of the …, 2021 - openaccess.thecvf.com
In this paper we introduce a Transformer-based approach to video object segmentation
(VOS). To address compounding error and scalability issues of prior work, we propose a …

SPLATNet: Sparse lattice networks for point cloud processing

H Su, V Jampani, D Sun, S Maji… - Proceedings of the …, 2018 - openaccess.thecvf.com
We present a network architecture for processing point clouds that directly operates on a
collection of points represented as a sparse set of samples in a high-dimensional lattice …

YouTube-VOS: A large-scale video object segmentation benchmark

N Xu, L Yang, Y Fan, D Yue, Y Liang, J Yang… - arXiv preprint arXiv …, 2018 - arxiv.org
Learning long-term spatial-temporal features are critical for many video analysis tasks.
However, existing video segmentation methods predominantly rely on static image …