A survey of the vision transformers and their CNN-transformer based variants

A Khan, Z Rauf, A Sohail, AR Khan, H Asif… - Artificial Intelligence …, 2023 - Springer
Vision transformers have become popular as a possible substitute for convolutional neural
networks (CNNs) in a variety of computer vision applications. These transformers, with their …

One novel transfer learning-based CLIP model combined with self-attention mechanism for differentiating the tumor-stroma ratio in pancreatic ductal adenocarcinoma

H Liao, J Yuan, C Liu, J Zhang, Y Yang, H Liang… - La radiologia …, 2024 - Springer
Purpose To develop a contrastive language-image pretraining (CLIP) model based on
transfer learning and combined with a self-attention mechanism to predict the tumor-stroma …

Towards efficient task-driven model reprogramming with foundation models

S Xu, J Yao, R Luo, S Zhang, Z Lian, M Tan… - arXiv preprint arXiv …, 2023 - arxiv.org
Vision foundation models exhibit impressive power, benefiting from their extremely large
model capacity and broad training data. However, in practice, downstream scenarios may …

Large coordinate kernel attention network for lightweight image super-resolution

F Hao, J Wu, H Lu, J Du, J Xu, X Xu - arXiv preprint arXiv:2405.09353, 2024 - arxiv.org
The multi-scale receptive field and large kernel attention (LKA) module have been shown to
significantly improve performance in the lightweight image super-resolution task. However …