Large-scale video panoptic segmentation in the wild: A benchmark

AnyDoor: Zero-shot object-level image customization

X Chen, L Huang, Y Liu, Y Shen… - Proceedings of the …, 2024 - openaccess.thecvf.com

This work presents AnyDoor, a diffusion-based image generator with the power to teleport
target objects to new scenes at user-specified locations with desired shapes. Instead of …

Cited by 117 · Related articles · All 3 versions

Sequential modeling enables scalable learning for large vision models

Y Bai, X Geng, K Mangalam, A Bar… - Proceedings of the …, 2024 - openaccess.thecvf.com

We introduce a novel sequential modeling approach that enables learning a Large Vision
Model (LVM) without making use of any linguistic data. To do this, we define a common …

Cited by 71 · Related articles · All 3 versions

Tracking anything with decoupled video segmentation

HK Cheng, SW Oh, B Price… - Proceedings of the …, 2023 - openaccess.thecvf.com

Training data for video segmentation are expensive to annotate. This impedes extensions of
end-to-end algorithms to new video segmentation tasks, especially in large-vocabulary …

Cited by 70 · Related articles · All 7 versions

MOSE: A new dataset for video object segmentation in complex scenes

H Ding, C Liu, S He, X Jiang… - Proceedings of the …, 2023 - openaccess.thecvf.com

Video object segmentation (VOS) aims at segmenting a particular object throughout the
entire video clip sequence. The state-of-the-art VOS methods have achieved excellent …

Cited by 92 · Related articles · All 7 versions

OMG-Seg: Is one model good enough for all segmentation?

X Li, H Yuan, W Li, H Ding, S Wu… - Proceedings of the …, 2024 - openaccess.thecvf.com

In this work, we address various segmentation tasks, each traditionally tackled by distinct or
partially unified models. We propose OMG-Seg, One Model that is Good enough to efficiently …

Cited by 26 · Related articles · All 3 versions

Transformer-based visual segmentation: A survey

X Li, H Ding, H Yuan, W Zhang, J Pang… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org

Visual segmentation seeks to partition images, video frames, or point clouds into multiple
segments or groups. This technique has numerous real-world applications, such as …

Cited by 65 · Related articles · All 3 versions

SAM 2: Segment anything in images and videos

N Ravi, V Gabeur, YT Hu, R Hu, C Ryali, T Ma… - arXiv preprint arXiv …, 2024 - arxiv.org

We present Segment Anything Model 2 (SAM 2), a foundation model towards solving
promptable visual segmentation in images and videos. We build a data engine, which …

Cited by 35 · Related articles · All 2 versions

Video K-Net: A simple, strong, and unified baseline for video segmentation

X Li, W Zhang, J Pang, K Chen… - Proceedings of the …, 2022 - openaccess.thecvf.com

This paper presents Video K-Net, a simple, strong, and unified framework for fully
end-to-end video panoptic segmentation. The method is built upon K-Net, a method that unifies …

Cited by 88 · Related articles · All 6 versions

Tube-Link: A flexible cross tube framework for universal video segmentation

X Li, H Yuan, W Zhang, G Cheng… - Proceedings of the …, 2023 - openaccess.thecvf.com

Video segmentation aims to segment and track every pixel in diverse scenarios accurately.
In this paper, we present Tube-Link, a versatile framework that addresses multiple core tasks …

Cited by 41 · Related articles · All 5 versions

Learning mask-aware CLIP representations for zero-shot segmentation

S Jiao, Y Wei, Y Wang, Y Zhao… - Advances in Neural …, 2023 - proceedings.neurips.cc

Recently, pre-trained vision-language models have been increasingly used to tackle the
challenging zero-shot segmentation task. Typical solutions follow the paradigm of first …

Cited by 22 · Related articles · All 5 versions