Advances in medical image analysis with vision transformers: a comprehensive review

R Azad, A Kazerouni, M Heidari, EK Aghdam… - Medical Image …, 2023 - Elsevier
The remarkable performance of the Transformer architecture in natural language processing
has recently also triggered broad interest in Computer Vision. Among other merits …

EfficientFormer: Vision transformers at MobileNet speed

Y Li, G Yuan, Y Wen, J Hu… - Advances in …, 2022 - proceedings.neurips.cc
Vision Transformers (ViT) have shown rapid progress in computer vision tasks,
achieving promising results on various benchmarks. However, due to the massive number of …

Rethinking vision transformers for MobileNet size and speed

Y Li, J Hu, Y Wen, G Evangelidis… - Proceedings of the …, 2023 - openaccess.thecvf.com
With the success of Vision Transformers (ViTs) in computer vision tasks, recent arts try to
optimize the performance and complexity of ViTs to enable efficient deployment on mobile …

Next-ViT: Next generation vision transformer for efficient deployment in realistic industrial scenarios

J Li, X Xia, W Li, H Li, X Wang, X Xiao, R Wang… - arXiv preprint arXiv …, 2022 - arxiv.org
Due to the complex attention mechanisms and model design, most existing vision
Transformers (ViTs) cannot perform as efficiently as convolutional neural networks (CNNs) …

Swin3D: A pretrained transformer backbone for 3D indoor scene understanding

YQ Yang, YX Guo, JY Xiong, Y Liu, H Pan… - arXiv preprint arXiv …, 2023 - arxiv.org
The use of pretrained backbones with fine-tuning has been successful for 2D vision and
natural language processing tasks, showing advantages over task-specific networks. In this …

ElasticViT: Conflict-aware supernet training for deploying fast vision transformer on diverse mobile devices

C Tang, LL Zhang, H Jiang, J Xu… - Proceedings of the …, 2023 - openaccess.thecvf.com
Neural Architecture Search (NAS) has shown promising performance in the
automatic design of vision transformers (ViT) exceeding 1G FLOPs. However, designing …

A CNN-transformer hybrid model based on CSWin transformer for UAV image object detection

W Lu, C Lan, C Niu, W Liu, L Lyu… - IEEE Journal of …, 2023 - ieeexplore.ieee.org
The object detection of unmanned aerial vehicle (UAV) images has widespread applications
in numerous fields; however, the complex background, diverse scales, and uneven …

Light-YOLOv5: A lightweight algorithm for improved YOLOv5 in complex fire scenarios

H Xu, B Li, F Zhong - Applied Sciences, 2022 - mdpi.com
Fire-detection technology is of great importance for successful fire-prevention measures.
Image-based fire detection is one effective method. At present, object-detection algorithms …

SDBAD-Net: A spatial dual-branch attention dehazing network based on meta-former paradigm

G Zhang, W Fang, Y Zheng… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Image dehazing is an emblematical low-level vision task that aims at restoring haze-free
images from haze images. Recently, some methods adopt deep learning techniques to …

TRT-ViT: TensorRT-oriented vision transformer

X Xia, J Li, J Wu, X Wang, X Xiao, M Zheng… - arXiv preprint arXiv …, 2022 - arxiv.org
We revisit the existing excellent Transformers from the perspective of practical application.
Most of them are not even as efficient as the basic ResNets series and deviate from the …