ViT-NeT: Interpretable vision transformers with neural tree decoder

R Azad, A Kazerouni, M Heidari, EK Aghdam… - Medical Image …, 2023 - Elsevier

The remarkable performance of the Transformer architecture in natural language processing
has recently also triggered broad interest in Computer Vision. Among other merits …

Cited by 73 · Related articles · All 7 versions

STAR-Transformer: a spatio-temporal cross attention transformer for human action recognition

D Ahn, S Kim, H Hong, BC Ko - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

In action recognition, although the combination of spatio-temporal videos and skeleton
features can improve the recognition performance, a separate model and balancing feature …

Cited by 112 · Related articles · All 6 versions

SwinMM: masked multi-view with Swin transformers for 3D medical image segmentation

Y Wang, Z Li, J Mei, Z Wei, L Liu, C Wang… - … Conference on Medical …, 2023 - Springer

Recent advancements in large-scale Vision Transformers have made significant strides in
improving pre-trained models for medical image segmentation. However, these methods …

Cited by 24 · Related articles · All 6 versions

A survey of neural trees

H Li, J Song, M Xue, H Zhang, J Ye, L Cheng… - arXiv preprint arXiv …, 2022 - arxiv.org

Neural networks (NNs) and decision trees (DTs) are both popular models of machine
learning, yet coming with mutually exclusive advantages and limitations. To bring the best of …

Cited by 10 · Related articles · All 3 versions

Fine-grained visual classification with high-temperature refinement and background suppression

PY Chou, YY Kao, CH Lin - arXiv preprint arXiv:2303.06442, 2023 - arxiv.org

Fine-grained visual classification is a challenging task due to the high similarity between
categories and distinct differences among data within one single category. To address the …

Cited by 28 · Related articles · All 2 versions

ProtoPFormer: Concentrating on prototypical parts in vision transformers for interpretable image recognition

M Xue, Q Huang, H Zhang, L Cheng, J Song… - arXiv preprint arXiv …, 2022 - arxiv.org

Prototypical part network (ProtoPNet) has drawn wide attention and boosted many follow-up
studies due to its self-explanatory property for explainable artificial intelligence (XAI) …

Cited by 39 · Related articles · All 3 versions

Transformers pay attention to convolutions leveraging emerging properties of ViTs by dual attention-image network

Y Yeganeh, A Farshad, P Weinberger… - Proceedings of the …, 2023 - openaccess.thecvf.com

Although purely transformer-based architectures pretrained on large datasets are introduced
as foundation models for general computer vision tasks, hybrid models that incorporate …

Cited by 5 · Related articles · All 4 versions

Learning support and trivial prototypes for interpretable image classification

C Wang, Y Liu, Y Chen, F Liu, Y Tian… - Proceedings of the …, 2023 - openaccess.thecvf.com

Prototypical part network (ProtoPNet) methods have been designed to achieve interpretable
classification by associating predictions with a set of training prototypes, which we refer to as …

Cited by 16 · Related articles · All 6 versions

Pixel-grounded prototypical part networks

Z Carmichael, S Lohit, A Cherian… - Proceedings of the …, 2024 - openaccess.thecvf.com

Prototypical part neural networks (ProtoPartNNs), namely ProtoPNet and its derivatives, are
an intrinsically interpretable approach to machine learning. Their prototype learning scheme …

Cited by 6 · Related articles · All 6 versions

FET-FGVC: Feature-enhanced transformer for fine-grained visual classification

H Chen, H Zhang, C Liu, J An, Z Gao, J Qiu - Pattern Recognition, 2024 - Elsevier

The challenge of Fine-grained visual classification (FGVC) comes from the small variations
between classes and the large variations within classes. Inspired by the fact that identifying …

Cited by 10 · Related articles · All 2 versions