Review of large vision models and visual prompt engineering
Visual prompt engineering is a fundamental methodology in the field of visual and image
artificial general intelligence. As the development of large vision models progresses, the …
A survey of recent interactive image segmentation methods
Image segmentation is one of the most basic tasks in computer vision and remains an initial
step of many applications. In this paper, we focus on interactive image segmentation (IIS) …
Medical SAM Adapter: Adapting Segment Anything Model for medical image segmentation
The Segment Anything Model (SAM) has recently gained popularity in the field of image
segmentation due to its impressive capabilities in various segmentation tasks and its prompt …
FocalClick: Towards practical interactive image segmentation
X Chen, Z Zhao, Y Zhang, M Duan… - Proceedings of the …, 2022 - openaccess.thecvf.com
Interactive segmentation allows users to extract target masks by making positive/negative
clicks. Although explored by many previous works, there is still a gap between academic …
SimpleClick: Interactive image segmentation with simple vision transformers
Click-based interactive image segmentation aims at extracting objects with a limited user
clicking. A hierarchical backbone is the de-facto architecture for current methods. Recently …
Res2Net: A new multi-scale backbone architecture
Representing features at multiple scales is of great importance for numerous vision tasks.
Recent advances in backbone convolutional neural networks (CNNs) continually …
UC-Net: Uncertainty inspired RGB-D saliency detection via conditional variational autoencoders
In this paper, we propose the first framework (UCNet) to employ uncertainty for RGB-D
saliency detection by learning from the data labeling process. Existing RGB-D saliency …
Reviving iterative training with mask guidance for interactive segmentation
Recent works on click-based interactive segmentation have demonstrated state-of-the-art
results by using various inference-time optimization schemes. These methods are …
Interactive medical image annotation using improved Attention U-net with compound geodesic distance
Y Zhang, J Chen, X Ma, G Wang, UA Bhatti… - Expert systems with …, 2024 - Elsevier
Accurate and massive medical image annotation data is crucial for diagnosis, surgical
planning, and deep learning in the development of medical images. However, creating large …
Caption Anything: Interactive image description with diverse multimodal controls
Controllable image captioning is an emerging multimodal topic that aims to describe the
image with natural language following human purpose, e.g., looking at the …