Advances in medical image analysis with vision transformers: a comprehensive review

R Azad, A Kazerouni, M Heidari, EK Aghdam… - Medical Image …, 2023 - Elsevier
The remarkable performance of the Transformer architecture in natural language processing
has recently also triggered broad interest in Computer Vision. Among other merits …

STAR-Transformer: a spatio-temporal cross attention transformer for human action recognition

D Ahn, S Kim, H Hong, BC Ko - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
In action recognition, although the combination of spatio-temporal videos and skeleton
features can improve the recognition performance, a separate model and balancing feature …

SwinMM: masked multi-view with Swin Transformers for 3D medical image segmentation

Y Wang, Z Li, J Mei, Z Wei, L Liu, C Wang… - … Conference on Medical …, 2023 - Springer
Recent advancements in large-scale Vision Transformers have made significant strides in
improving pre-trained models for medical image segmentation. However, these methods …

A survey of neural trees

H Li, J Song, M Xue, H Zhang, J Ye, L Cheng… - arXiv preprint arXiv …, 2022 - arxiv.org
Neural networks (NNs) and decision trees (DTs) are both popular models of machine
learning, yet coming with mutually exclusive advantages and limitations. To bring the best of …

Fine-grained visual classification with high-temperature refinement and background suppression

PY Chou, YY Kao, CH Lin - arXiv preprint arXiv:2303.06442, 2023 - arxiv.org
Fine-grained visual classification is a challenging task due to the high similarity between
categories and distinct differences among data within one single category. To address the …

ProtoPFormer: Concentrating on prototypical parts in vision transformers for interpretable image recognition

M Xue, Q Huang, H Zhang, L Cheng, J Song… - arXiv preprint arXiv …, 2022 - arxiv.org
Prototypical part network (ProtoPNet) has drawn wide attention and boosted many follow-up
studies due to its self-explanatory property for explainable artificial intelligence (XAI) …

Transformers pay attention to convolutions leveraging emerging properties of ViTs by dual attention-image network

Y Yeganeh, A Farshad, P Weinberger… - Proceedings of the …, 2023 - openaccess.thecvf.com
Although purely transformer-based architectures pretrained on large datasets are introduced
as foundation models for general computer vision tasks, hybrid models that incorporate …

Learning support and trivial prototypes for interpretable image classification

C Wang, Y Liu, Y Chen, F Liu, Y Tian… - Proceedings of the …, 2023 - openaccess.thecvf.com
Prototypical part network (ProtoPNet) methods have been designed to achieve interpretable
classification by associating predictions with a set of training prototypes, which we refer to as …

Pixel-grounded prototypical part networks

Z Carmichael, S Lohit, A Cherian… - Proceedings of the …, 2024 - openaccess.thecvf.com
Prototypical part neural networks (ProtoPartNNs), namely ProtoPNet and its derivatives, are
an intrinsically interpretable approach to machine learning. Their prototype learning scheme …

FET-FGVC: Feature-enhanced transformer for fine-grained visual classification

H Chen, H Zhang, C Liu, J An, Z Gao, J Qiu - Pattern Recognition, 2024 - Elsevier
The challenge of Fine-grained visual classification (FGVC) comes from the small variations
between classes and the large variations within classes. Inspired by the fact that identifying …