Review of large vision models and visual prompt engineering

J Wang, Z Liu, L Zhao, Z Wu, C Ma, S Yu, H Dai… - Meta-Radiology, 2023 - Elsevier
Visual prompt engineering is a fundamental methodology in the field of visual and image
artificial general intelligence. As the development of large vision models progresses, the …

Large language models: a comprehensive survey of its applications, challenges, limitations, and future prospects

MU Hadi, Q Al Tashi, A Shah, R Qureshi… - Authorea …, 2024 - authorea.com
Within the vast expanse of computerized language processing, a revolutionary entity known
as Large Language Models (LLMs) has emerged, wielding immense power in its capacity to …

Segment anything model for medical image analysis: an experimental study

MA Mazurowski, H Dong, H Gu, J Yang, N Konz… - Medical Image …, 2023 - Elsevier
Training segmentation models for medical images continues to be challenging due to the
limited availability of data annotations. Segment Anything Model (SAM) is a foundation …

Segment anything model for medical images?

Y Huang, X Yang, L Liu, H Zhou, A Chang, X Zhou… - Medical Image …, 2024 - Elsevier
The Segment Anything Model (SAM) is the first foundation model for general image
segmentation. It has achieved impressive results on various natural image segmentation …

Medical SAM Adapter: Adapting Segment Anything Model for medical image segmentation

J Wu, W Ji, Y Liu, H Fu, M Xu, Y Xu, Y Jin - arXiv preprint arXiv:2304.12620, 2023 - arxiv.org
The Segment Anything Model (SAM) has recently gained popularity in the field of image
segmentation due to its impressive capabilities in various segmentation tasks and its prompt …

Multimodal foundation models: From specialists to general-purpose assistants

C Li, Z Gan, Z Yang, J Yang, L Li… - … and Trends® in …, 2024 - nowpublishers.com

SAM-CLIP: Merging vision foundation models towards semantic and spatial understanding

H Wang, PKA Vasu, F Faghri… - Proceedings of the …, 2024 - openaccess.thecvf.com
The landscape of publicly available vision foundation models (VFMs) such as CLIP and
SAM is expanding rapidly. VFMs are endowed with distinct capabilities stemming from their …

U-Mamba: Enhancing long-range dependency for biomedical image segmentation

J Ma, F Li, B Wang - arXiv preprint arXiv:2401.04722, 2024 - arxiv.org
Convolutional Neural Networks (CNNs) and Transformers have been the most popular
architectures for biomedical image segmentation, but both of them have limited ability to …

Segment anything is not always perfect: An investigation of SAM on different real-world applications

W Ji, J Li, Q Bi, T Liu, W Li, L Cheng - 2024 - Springer
Recently, Meta AI Research released a general, promptable Segment Anything
Model (SAM) pre-trained on an unprecedentedly large segmentation dataset (SA-1B) …

SAM 2: Segment anything in images and videos

N Ravi, V Gabeur, YT Hu, R Hu, C Ryali, T Ma… - arXiv preprint arXiv …, 2024 - arxiv.org
We present Segment Anything Model 2 (SAM 2), a foundation model towards solving
promptable visual segmentation in images and videos. We build a data engine, which …