On the challenges and perspectives of foundation models for medical image analysis

S Zhang, D Metaxas - Medical image analysis, 2024 - Elsevier
This article discusses the opportunities, applications and future directions of large-scale
pretrained models, i.e., foundation models, which promise to significantly improve the …

Foundational models defining a new era in vision: A survey and outlook

M Awais, M Naseer, S Khan, RM Anwer… - arXiv preprint arXiv …, 2023 - arxiv.org
Vision systems to see and reason about the compositional nature of visual scenes are
fundamental to understanding our world. The complex relations between objects and their …

Segment anything is not always perfect: An investigation of SAM on different real-world applications

W Ji, J Li, Q Bi, T Liu, W Li, L Cheng - 2024 - Springer
Recently, Meta AI Research approaches a general, promptable segment anything
model (SAM) pre-trained on an unprecedentedly large segmentation dataset (SA-1B) …

Polyp-PVT: Polyp segmentation with pyramid vision transformers

B Dong, W Wang, DP Fan, J Li, H Fu, L Shao - arXiv preprint arXiv …, 2021 - arxiv.org
Most polyp segmentation methods use CNNs as their backbone, leading to two key issues
when exchanging information between the encoder and decoder: 1) taking into account the …

Camouflaged object detection via context-aware cross-level fusion

G Chen, SJ Liu, YJ Sun, GP Ji, YF Wu… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Camouflaged object detection (COD) aims to identify the objects that conceal themselves in
natural scenes. Accurate COD suffers from a number of challenges associated with low …

Deep gradient learning for efficient camouflaged object detection

GP Ji, DP Fan, YC Chou, D Dai, A Liniger… - Machine Intelligence …, 2023 - Springer
This paper introduces deep gradient network (DGNet), a novel deep framework that exploits
object gradient supervision for camouflaged object detection (COD). It decouples the task …

See more and know more: Zero-shot point cloud segmentation via multi-modal visual data

Y Lu, Q Jiang, R Chen, Y Hou… - Proceedings of the …, 2023 - openaccess.thecvf.com
Zero-shot point cloud segmentation aims to make deep models capable of recognizing
novel objects in point clouds that are unseen in the training phase. Recent trends favor the …

TALL: Thumbnail layout for deepfake video detection

Y Xu, J Liang, G Jia, Z Yang… - Proceedings of the …, 2023 - openaccess.thecvf.com
The growing threats of deepfakes to society and cybersecurity have raised enormous public
concerns, and increasing efforts have been devoted to this critical topic of deepfake video …

Advances in deep concealed scene understanding

DP Fan, GP Ji, P Xu, MM Cheng, C Sakaridis… - Visual Intelligence, 2023 - Springer
Concealed scene understanding (CSU) is a hot computer vision topic aiming to perceive
objects exhibiting camouflage. The current boom in terms of techniques and applications …

UniSeg: A unified multi-modal LiDAR segmentation network and the OpenPCSeg codebase

Y Liu, R Chen, X Li, L Kong, Y Yang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Point-, voxel-, and range-views are three representative forms of point clouds. All of
them have accurate 3D measurements but lack color and texture information. RGB images …