Gres: Generalized referring expression segmentation
Abstract Referring Expression Segmentation (RES) aims to generate a segmentation mask
for the object described by a given language expression. Existing classic RES datasets and …
for the object described by a given language expression. Existing classic RES datasets and …
Tracking anything with decoupled video segmentation
Training data for video segmentation are expensive to annotate. This impedes extensions of
end-to-end algorithms to new video segmentation tasks, especially in large-vocabulary …
end-to-end algorithms to new video segmentation tasks, especially in large-vocabulary …
MOSE: A new dataset for video object segmentation in complex scenes
Video object segmentation (VOS) aims at segmenting a particular object throughout the
entire video clip sequence. The state-of-the-art VOS methods have achieved excellent …
entire video clip sequence. The state-of-the-art VOS methods have achieved excellent …
MeViS: A large-scale benchmark for video segmentation with motion expressions
This paper strives for motion expressions guided video segmentation, which focuses on
segmenting objects in video content based on a sentence describing the motion of the …
segmenting objects in video content based on a sentence describing the motion of the …
OMG-Seg: Is one model good enough for all segmentation?
In this work we address various segmentation tasks each traditionally tackled by distinct or
partially unified models. We propose OMG-Seg One Model that is Good enough to efficiently …
partially unified models. We propose OMG-Seg One Model that is Good enough to efficiently …
Bridging vision and language encoders: Parameter-efficient tuning for referring image segmentation
Parameter efficient tuning (PET) has received considerable attention owing to its
applicability to reduce the number of parameters that need to be updated while maintaining …
applicability to reduce the number of parameters that need to be updated while maintaining …
Transformer-based visual segmentation: A survey
Visual segmentation seeks to partition images, video frames, or point clouds into multiple
segments or groups. This technique has numerous real-world applications, such as …
segments or groups. This technique has numerous real-world applications, such as …
Towards open vocabulary learning: A survey
In the field of visual scene understanding, deep neural networks have made impressive
advancements in various core tasks like segmentation, tracking, and detection. However …
advancements in various core tasks like segmentation, tracking, and detection. However …
Hierarchical open-vocabulary universal image segmentation
Open-vocabulary image segmentation aims to partition an image into semantic regions
according to arbitrary text descriptions. However, complex visual scenes can be naturally …
according to arbitrary text descriptions. However, complex visual scenes can be naturally …
Spectrum-guided multi-granularity referring video object segmentation
Current referring video object segmentation (R-VOS) techniques extract conditional kernels
from encoded (low-resolution) vision-language features to segment the decoded high …
from encoded (low-resolution) vision-language features to segment the decoded high …