Transformer-based visual segmentation: A survey
Visual segmentation seeks to partition images, video frames, or point clouds into multiple
segments or groups. This technique has numerous real-world applications, such as …
OMG-LLaVA: Bridging image-level, object-level, pixel-level reasoning and understanding
Current universal segmentation methods demonstrate strong capabilities in pixel-level
image and video understanding. However, they lack reasoning abilities and cannot be …
VISA: Reasoning video object segmentation via large language models
Existing Video Object Segmentation (VOS) relies on explicit user instructions, such as
categories, masks, or short phrases, restricting their ability to perform complex video …
ViLLa: Video Reasoning Segmentation with Large Language Model
Although video perception models have made remarkable advancements in recent years,
they still heavily rely on explicit text descriptions or pre-defined categories to identify target …