Deep learning for visual tracking: A comprehensive survey

SM Marvasti-Zadeh, L Cheng… - IEEE Transactions …, 2021 - ieeexplore.ieee.org
Visual target tracking is one of the most sought-after yet challenging research topics in
computer vision. Given the ill-posed nature of the problem and its popularity in a broad …

Video object segmentation and tracking: A survey

R Yao, G Lin, S Xia, J Zhao, Y Zhou - ACM Transactions on Intelligent …, 2020 - dl.acm.org
Object segmentation and object tracking are fundamental research areas in the computer
vision community. These two topics are difficult to handle some common challenges, such …

Visual prompt multi-modal tracking

J Zhu, S Lai, X Chen, D Wang… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Visible-modal object tracking gives rise to a series of downstream multi-modal tracking
tributaries. To inherit the powerful representations of the foundation model, a natural modus …

SAM 2: Segment anything in images and videos

N Ravi, V Gabeur, YT Hu, R Hu, C Ryali, T Ma… - arXiv preprint arXiv …, 2024 - arxiv.org
We present Segment Anything Model 2 (SAM 2), a foundation model towards solving
promptable visual segmentation in images and videos. We build a data engine, which …

MOSE: A new dataset for video object segmentation in complex scenes

H Ding, C Liu, S He, X Jiang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Video object segmentation (VOS) aims at segmenting a particular object throughout the
entire video clip sequence. The state-of-the-art VOS methods have achieved excellent …

Towards an end-to-end framework for flow-guided video inpainting

Z Li, CZ Lu, J Qin, CL Guo… - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com
Optical flow, which captures motion information across frames, is exploited in recent video
inpainting methods through propagating pixels along its trajectories. However, the hand …

ProPainter: Improving propagation and transformer for video inpainting

S Zhou, C Li, KCK Chan… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Flow-based propagation and spatiotemporal Transformer are two mainstream mechanisms
in video inpainting (VI). Despite the effectiveness of these components, they still suffer from …

Siam R-CNN: Visual tracking by re-detection

P Voigtlaender, J Luiten, PHS Torr… - Proceedings of the …, 2020 - openaccess.thecvf.com
We present Siam R-CNN, a Siamese re-detection architecture which unleashes the
full power of two-stage object detection approaches for visual object tracking. We combine …

End-to-end referring video object segmentation with multimodal transformers

A Botach, E Zheltonozhskii… - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com
The referring video object segmentation task (RVOS) involves segmentation of a text-
referred object instance in the frames of a given video. Due to the complex nature of this …

Video object segmentation with episodic graph memory networks

X Lu, W Wang, M Danelljan, T Zhou, J Shen… - Computer Vision–ECCV …, 2020 - Springer
How to make a segmentation model efficiently adapt to a specific video as well as online
target appearance variations is a fundamental issue in the field of video object …