Transformer-based visual segmentation: A survey
Visual segmentation seeks to partition images, video frames, or point clouds into multiple
segments or groups. This technique has numerous real-world applications, such as …
Open-vocabulary SAM: Segment and recognize twenty-thousand classes interactively
The CLIP and Segment Anything Model (SAM) are remarkable vision foundation
models (VFMs). SAM excels in segmentation tasks across diverse domains, whereas CLIP is …
Tube-Link: A flexible cross tube framework for universal video segmentation
Video segmentation aims to segment and track every pixel in diverse scenarios accurately.
In this paper, we present Tube-Link, a versatile framework that addresses multiple core tasks …
Toward general-purpose robots via foundation models: A survey and meta-analysis
Building general-purpose robots that operate seamlessly in any environment, with any
object, and utilizing various skills to complete diverse tasks has been a long-standing goal in …
Towards language-driven video inpainting via multimodal large language models
We introduce a new task, language-driven video inpainting, which uses natural language
instructions to guide the inpainting process. This approach overcomes the limitations of …
MosaicFusion: Diffusion models as data augmenters for large vocabulary instance segmentation
We present MosaicFusion, a simple yet effective diffusion-based data augmentation
approach for large vocabulary instance segmentation. Our method is training-free and does …
Betrayed by captions: Joint caption grounding and generation for open vocabulary instance segmentation
In this work, we focus on open vocabulary instance segmentation to expand a segmentation
model to classify and segment instance-level novel categories. Previous approaches have …
Open-vocabulary video anomaly detection
Current video anomaly detection (VAD) approaches with weak supervision are inherently
limited to a closed-set setting and may struggle in open-world applications where there can …
Domain generalization for semantic segmentation: A survey
TH Rafi, R Mahjabin, E Ghosh, YW Ko… - Artificial Intelligence …, 2024 - Springer
Deep neural networks (DNNs) have proven explicit contributions in making autonomous
driving cars and related tasks such as semantic segmentation, motion tracking, object …
CLIP-AD: A language-guided staged dual-path model for zero-shot anomaly detection
This paper considers zero-shot Anomaly Detection (AD), performing AD without reference
images of the test objects. We propose a framework called CLIP-AD to leverage the zero …