Anydoor: Zero-shot object-level image customization

X Chen, L Huang, Y Liu, Y Shen… - Proceedings of the …, 2024 - openaccess.thecvf.com
This work presents AnyDoor a diffusion-based image generator with the power to teleport
target objects to new scenes at user-specified locations with desired shapes. Instead of …

Sequential modeling enables scalable learning for large vision models

Y Bai, X Geng, K Mangalam, A Bar… - Proceedings of the …, 2024 - openaccess.thecvf.com
We introduce a novel sequential modeling approach which enables learning a Large Vision
Model (LVM) without making use of any linguistic data. To do this we define a common …

Tracking anything with decoupled video segmentation

HK Cheng, SW Oh, B Price… - Proceedings of the …, 2023 - openaccess.thecvf.com
Training data for video segmentation are expensive to annotate. This impedes extensions of
end-to-end algorithms to new video segmentation tasks, especially in large-vocabulary …

MOSE: A new dataset for video object segmentation in complex scenes

H Ding, C Liu, S He, X Jiang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Video object segmentation (VOS) aims at segmenting a particular object throughout the
entire video clip sequence. The state-of-the-art VOS methods have achieved excellent …

OMG-Seg: Is one model good enough for all segmentation?

X Li, H Yuan, W Li, H Ding, S Wu… - Proceedings of the …, 2024 - openaccess.thecvf.com
In this work we address various segmentation tasks each traditionally tackled by distinct or
partially unified models. We propose OMG-Seg One Model that is Good enough to efficiently …

Transformer-based visual segmentation: A survey

X Li, H Ding, H Yuan, W Zhang, J Pang… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org
Visual segmentation seeks to partition images, video frames, or point clouds into multiple
segments or groups. This technique has numerous real-world applications, such as …

Sam 2: Segment anything in images and videos

N Ravi, V Gabeur, YT Hu, R Hu, C Ryali, T Ma… - arXiv preprint arXiv …, 2024 - arxiv.org
We present Segment Anything Model 2 (SAM 2), a foundation model towards solving
promptable visual segmentation in images and videos. We build a data engine, which …

Video k-net: A simple, strong, and unified baseline for video segmentation

X Li, W Zhang, J Pang, K Chen… - Proceedings of the …, 2022 - openaccess.thecvf.com
This paper presents Video K-Net, a simple, strong, and unified framework for fully end-to-
end video panoptic segmentation. The method is built upon K-Net, a method that unifies …

Tube-Link: A flexible cross tube framework for universal video segmentation

X Li, H Yuan, W Zhang, G Cheng… - Proceedings of the …, 2023 - openaccess.thecvf.com
Video segmentation aims to segment and track every pixel in diverse scenarios accurately.
In this paper, we present Tube-Link, a versatile framework that addresses multiple core tasks …

Learning mask-aware clip representations for zero-shot segmentation

S Jiao, Y Wei, Y Wang, Y Zhao… - Advances in Neural …, 2023 - proceedings.neurips.cc
Recently, pre-trained vision-language models have been increasingly used to tackle the
challenging zero-shot segmentation task. Typical solutions follow the paradigm of first …