A comprehensive survey on segment anything model for vision and beyond

C Zhang, L Liu, Y Cui, G Huang, W Lin, Y Yang… - arXiv preprint arXiv …, 2023 - arxiv.org
Artificial intelligence (AI) is evolving towards artificial general intelligence, which refers to the
ability of an AI system to perform a wide range of tasks and exhibit a level of intelligence …

Llama-adapter: Efficient fine-tuning of language models with zero-init attention

R Zhang, J Han, C Liu, P Gao, A Zhou, X Hu… - arXiv preprint arXiv …, 2023 - arxiv.org
We present LLaMA-Adapter, a lightweight adaption method to efficiently fine-tune LLaMA
into an instruction-following model. Using 52K self-instruct demonstrations, LLaMA-Adapter …

Multimodal foundation models: From specialists to general-purpose assistants

C Li, Z Gan, Z Yang, J Yang, L Li… - … and Trends® in …, 2024 - nowpublishers.com
This monograph presents a comprehensive survey of the taxonomy and evolution of multimodal foundation models that demonstrate vision and vision-language capabilities, focusing on the transition from specialist models to general-purpose assistants …

RSPrompter: Learning to prompt for remote sensing instance segmentation based on visual foundation model

K Chen, C Liu, H Chen, H Zhang, W Li… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Leveraging the extensive training data from SA-1B, the segment anything model (SAM)
demonstrates remarkable generalization and zero-shot capabilities. However, as a category …

The segment anything model (sam) for remote sensing applications: From zero to one shot

LP Osco, Q Wu, EL de Lemos, WN Gonçalves… - International Journal of …, 2023 - Elsevier
Segmentation is an essential step for remote sensing image processing. This study aims to
advance the application of the Segment Anything Model (SAM), an innovative image …

Matting anything

J Li, J Jain, H Shi - … of the IEEE/CVF Conference on …, 2024 - openaccess.thecvf.com
In this paper, we propose the Matting Anything Model (MAM), an efficient and versatile
framework for estimating the alpha matte of any instance in an image with flexible and …

Pointclip v2: Prompting clip and gpt for powerful 3d open-world learning

X Zhu, R Zhang, B He, Z Guo, Z Zeng… - Proceedings of the …, 2023 - openaccess.thecvf.com
Large-scale pre-trained models have shown promising open-world performance for both
vision and language tasks. However, their transferred capacity on 3D point clouds is still …

Not all features matter: Enhancing few-shot clip with adaptive prior refinement

X Zhu, R Zhang, B He, A Zhou… - Proceedings of the …, 2023 - openaccess.thecvf.com
The popularity of Contrastive Language-Image Pre-training (CLIP) has propelled its
application to diverse downstream vision tasks. To improve its capacity on downstream …

Samrs: Scaling-up remote sensing segmentation dataset with segment anything model

D Wang, J Zhang, B Du, M Xu, L Liu… - Advances in Neural …, 2024 - proceedings.neurips.cc
The success of the Segment Anything Model (SAM) demonstrates the significance of data-
centric machine learning. However, due to the difficulties and high costs associated with …

Sam-6d: Segment anything model meets zero-shot 6d object pose estimation

J Lin, L Liu, D Lu, K Jia - … of the IEEE/CVF Conference on …, 2024 - openaccess.thecvf.com
Zero-shot 6D object pose estimation involves the detection of novel objects with their 6D
poses in cluttered scenes, presenting significant challenges for model generalizability …