SAM 2: Segment anything in images and videos

N Ravi, V Gabeur, YT Hu, R Hu, C Ryali, T Ma… - arXiv preprint arXiv …, 2024 - arxiv.org
We present Segment Anything Model 2 (SAM 2), a foundation model towards solving
promptable visual segmentation in images and videos. We build a data engine, which …

Integrating boxes and masks: A multi-object framework for unified visual tracking and segmentation

Y Xu, Z Yang, Y Yang - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
Tracking any given object(s) spatially and temporally is a common purpose in Visual Object
Tracking (VOT) and Video Object Segmentation (VOS). Joint tracking and segmentation …

OMG-LLaVA: Bridging image-level, object-level, pixel-level reasoning and understanding

T Zhang, X Li, H Fei, H Yuan, S Wu, S Ji… - arXiv preprint arXiv …, 2024 - arxiv.org
Current universal segmentation methods demonstrate strong capabilities in pixel-level
image and video understanding. However, they lack reasoning abilities and cannot be …

Unified Embedding Alignment for Open-Vocabulary Video Instance Segmentation

H Fang, P Wu, Y Li, X Zhang, X Lu - arXiv preprint arXiv:2407.07427, 2024 - arxiv.org
Open-Vocabulary Video Instance Segmentation (VIS) is attracting increasing attention due
to its ability to segment and track arbitrary objects. However, the recent Open-Vocabulary …

Cold SegDiffusion: A novel diffusion model for medical image segmentation

P Yan, M Li, J Zhang, G Li, Y Jiang, H Luo - Knowledge-Based Systems, 2024 - Elsevier
Medical image segmentation is crucial in accurately identifying and delineating regions of
interest in medical images, which can inform the diagnosis and treatment of various …

StAlK: Structural Alignment based Self Knowledge distillation for Medical Image Classification

S Sharma, A Kumar, J Monpara, J Chandra - Knowledge-Based Systems, 2024 - Elsevier
In the realm of medical image analysis, where challenges like high class imbalance, inter-
class similarity, and intra-class variance are prevalent, knowledge distillation has emerged …

VPE-WSVAD: Visual prompt exemplars for weakly-supervised video anomaly detection

Y Su, Y Tan, M Xing, S An - Knowledge-Based Systems, 2024 - Elsevier
Weakly Supervised Video Anomaly Detection (WSVAD) plays a crucial role in
visual surveillance by effectively distinguishing anomalies from normality with only video …

Temporo-Spatial Parallel Sparse Memory Networks for Efficient Video Object Segmentation

J Dang, H Zheng, B Wang, L Wang… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Memory-based networks have achieved tremendous success in video object segmentation.
However, these methods still suffer from unfaithful segmentation and inferior efficiency under …

1st Place Solution for MeViS Track in CVPR 2024 PVUW Workshop: Motion Expression guided Video Segmentation

M Gao, J Luo, J Yang, J Han, F Zheng - arXiv preprint arXiv:2406.07043, 2024 - arxiv.org
Motion Expression guided Video Segmentation (MeViS), as an emerging task, poses many
new challenges to the field of referring video object segmentation (RVOS). In this technical …

Noise-Tolerant Hybrid Prototypical Learning with Noisy Web Data

C Liang, L Zhu, Z Yang, W Chen, Y Yang - ACM Transactions on Multimedia … - dl.acm.org
We focus on the challenging problem of learning an unbiased classifier from a large number
of potentially relevant but noisily labeled web images given only a few clean labeled …