Review of large vision models and visual prompt engineering

J Wang, Z Liu, L Zhao, Z Wu, C Ma, S Yu, H Dai… - Meta-Radiology, 2023 - Elsevier
Visual prompt engineering is a fundamental methodology in the field of visual and image
artificial general intelligence. As the development of large vision models progresses, the …

A survey of recent interactive image segmentation methods

H Ramadan, C Lachqar, H Tairi - Computational visual media, 2020 - Springer
Image segmentation is one of the most basic tasks in computer vision and remains an initial
step of many applications. In this paper, we focus on interactive image segmentation (IIS) …

Medical SAM adapter: Adapting segment anything model for medical image segmentation

J Wu, W Ji, Y Liu, H Fu, M Xu, Y Xu, Y Jin - arXiv preprint arXiv:2304.12620, 2023 - arxiv.org
The Segment Anything Model (SAM) has recently gained popularity in the field of image
segmentation due to its impressive capabilities in various segmentation tasks and its prompt …

FocalClick: Towards practical interactive image segmentation

X Chen, Z Zhao, Y Zhang, M Duan… - Proceedings of the …, 2022 - openaccess.thecvf.com
Interactive segmentation allows users to extract target masks by making positive/negative
clicks. Although explored by many previous works, there is still a gap between academic …

SimpleClick: Interactive image segmentation with simple vision transformers

Q Liu, Z Xu, G Bertasius… - Proceedings of the …, 2023 - openaccess.thecvf.com
Click-based interactive image segmentation aims at extracting objects with a limited user
clicking. A hierarchical backbone is the de-facto architecture for current methods. Recently …

Res2Net: A new multi-scale backbone architecture

SH Gao, MM Cheng, K Zhao, XY Zhang… - IEEE transactions on …, 2019 - ieeexplore.ieee.org
Representing features at multiple scales is of great importance for numerous vision tasks.
Recent advances in backbone convolutional neural networks (CNNs) continually …

UC-Net: Uncertainty inspired RGB-D saliency detection via conditional variational autoencoders

J Zhang, DP Fan, Y Dai, S Anwar… - Proceedings of the …, 2020 - openaccess.thecvf.com
In this paper, we propose the first framework (UCNet) to employ uncertainty for RGB-D
saliency detection by learning from the data labeling process. Existing RGB-D saliency …

Reviving iterative training with mask guidance for interactive segmentation

K Sofiiuk, IA Petrov, A Konushin - 2022 IEEE International …, 2022 - ieeexplore.ieee.org
Recent works on click-based interactive segmentation have demonstrated state-of-the-art
results by using various inference-time optimization schemes. These methods are …

Interactive medical image annotation using improved Attention U-Net with compound geodesic distance

Y Zhang, J Chen, X Ma, G Wang, UA Bhatti… - Expert systems with …, 2024 - Elsevier
Accurate and massive medical image annotation data is crucial for diagnosis, surgical
planning, and deep learning in the development of medical images. However, creating large …

Caption anything: Interactive image description with diverse multimodal controls

T Wang, J Zhang, J Fei, H Zheng, Y Tang, Z Li… - arXiv preprint arXiv …, 2023 - arxiv.org
Controllable image captioning is an emerging multimodal topic that aims to describe the
image with natural language following human purpose, e.g., looking at the …