A comprehensive survey on pretrained foundation models: A history from BERT to ChatGPT

C Zhou, Q Li, C Li, J Yu, Y Liu, G Wang… - arXiv preprint arXiv …, 2023 - arxiv.org
Pretrained Foundation Models (PFMs) are regarded as the foundation for various
downstream tasks with different data modalities. A PFM (e.g., BERT, ChatGPT, and GPT-4) is …

Transformers in medical imaging: A survey

F Shamshad, S Khan, SW Zamir, MH Khan… - Medical Image …, 2023 - Elsevier
Following unprecedented success on the natural language tasks, Transformers have been
successfully applied to several computer vision problems, achieving state-of-the-art results …

Transformers in time series: A survey

Q Wen, T Zhou, C Zhang, W Chen, Z Ma, J Yan… - arXiv preprint arXiv …, 2022 - arxiv.org
Transformers have achieved superior performance in many tasks in natural language
processing and computer vision, which also triggered great interest in the time series …

Attention mechanisms in computer vision: A survey

MH Guo, TX Xu, JJ Liu, ZN Liu, PT Jiang, TJ Mu… - Computational visual …, 2022 - Springer
Humans can naturally and effectively find salient regions in complex scenes. Motivated by
this observation, attention mechanisms were introduced into computer vision with the aim of …

Point-BERT: Pre-training 3D point cloud transformers with masked point modeling

X Yu, L Tang, Y Rao, T Huang… - Proceedings of the …, 2022 - openaccess.thecvf.com
We present Point-BERT, a novel paradigm for learning Transformers to generalize the
concept of BERT onto 3D point clouds. Following BERT, we devise a Masked Point Modeling …

Transformers in medical image analysis

K He, C Gan, Z Li, I Rekik, Z Yin, W Ji, Y Gao, Q Wang… - Intelligent …, 2023 - Elsevier
Transformers have dominated the field of natural language processing and have recently
made an impact in the area of computer vision. In the field of medical image analysis …

CoAtNet: Marrying convolution and attention for all data sizes

Z Dai, H Liu, QV Le, M Tan - Advances in neural information …, 2021 - proceedings.neurips.cc
Transformers have attracted increasing interest in computer vision, but they still fall behind
state-of-the-art convolutional networks. In this work, we show that while Transformers tend to …

A survey of transformers

T Lin, Y Wang, X Liu, X Qiu - AI open, 2022 - Elsevier
Transformers have achieved great success in many artificial intelligence fields, such as
natural language processing, computer vision, and audio processing. Therefore, it is natural …

A survey of visual transformers

Y Liu, Y Zhang, Y Wang, F Hou, J Yuan… - … on Neural Networks …, 2023 - ieeexplore.ieee.org
Transformer, an attention-based encoder–decoder model, has already revolutionized the
field of natural language processing (NLP). Inspired by such significant achievements, some …

Focal modulation networks

J Yang, C Li, X Dai, J Gao - Advances in Neural Information …, 2022 - proceedings.neurips.cc
We propose focal modulation networks (FocalNets in short), where self-attention (SA) is
completely replaced by a focal modulation module for modeling token interactions in vision …