GRES: Generalized referring expression segmentation

C Liu, H Ding, X Jiang - … of the IEEE/CVF conference on …, 2023 - openaccess.thecvf.com

Referring Expression Segmentation (RES) aims to generate a segmentation mask
for the object described by a given language expression. Existing classic RES datasets and …

Cited by 116 · Related articles · All 6 versions

Tracking anything with decoupled video segmentation

HK Cheng, SW Oh, B Price… - Proceedings of the …, 2023 - openaccess.thecvf.com

Training data for video segmentation are expensive to annotate. This impedes extensions of
end-to-end algorithms to new video segmentation tasks, especially in large-vocabulary …

Cited by 91 · Related articles · All 7 versions

MOSE: A new dataset for video object segmentation in complex scenes

H Ding, C Liu, S He, X Jiang… - Proceedings of the …, 2023 - openaccess.thecvf.com

Video object segmentation (VOS) aims at segmenting a particular object throughout the
entire video clip sequence. The state-of-the-art VOS methods have achieved excellent …

Cited by 98 · Related articles · All 7 versions

MeViS: A large-scale benchmark for video segmentation with motion expressions

H Ding, C Liu, S He, X Jiang… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

This paper strives for motion expressions guided video segmentation, which focuses on
segmenting objects in video content based on a sentence describing the motion of the …

Cited by 58 · Related articles · All 6 versions

OMG-Seg: Is one model good enough for all segmentation?

X Li, H Yuan, W Li, H Ding, S Wu… - Proceedings of the …, 2024 - openaccess.thecvf.com

In this work, we address various segmentation tasks, each traditionally tackled by distinct or
partially unified models. We propose OMG-Seg, One Model that is Good enough to efficiently …

Cited by 28 · Related articles · All 3 versions

Bridging vision and language encoders: Parameter-efficient tuning for referring image segmentation

Z Xu, Z Chen, Y Zhang, Y Song… - Proceedings of the …, 2023 - openaccess.thecvf.com

Parameter efficient tuning (PET) has received considerable attention owing to its
applicability to reduce the number of parameters that need to be updated while maintaining …

Cited by 48 · Related articles · All 5 versions

Transformer-based visual segmentation: A survey

X Li, H Ding, H Yuan, W Zhang, J Pang… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org

Visual segmentation seeks to partition images, video frames, or point clouds into multiple
segments or groups. This technique has numerous real-world applications, such as …

Cited by 71 · Related articles · All 3 versions

Towards open vocabulary learning: A survey

J Wu, X Li, S Xu, H Yuan, H Ding… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org

In the field of visual scene understanding, deep neural networks have made impressive
advancements in various core tasks like segmentation, tracking, and detection. However …

Cited by 90 · Related articles · All 10 versions

Hierarchical open-vocabulary universal image segmentation

X Wang, S Li, K Kallidromitis, Y Kato… - Advances in …, 2024 - proceedings.neurips.cc

Open-vocabulary image segmentation aims to partition an image into semantic regions
according to arbitrary text descriptions. However, complex visual scenes can be naturally …

Cited by 27 · Related articles · All 5 versions

Spectrum-guided multi-granularity referring video object segmentation

B Miao, M Bennamoun, Y Gao… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

Current referring video object segmentation (R-VOS) techniques extract conditional kernels
from encoded (low-resolution) vision-language features to segment the decoded high …

Cited by 32 · Related articles · All 7 versions