- Academic resource search

Transformers in vision: A survey

S Khan, M Naseer, M Hayat, SW Zamir… - ACM computing …, 2022 - dl.acm.org

Astounding results from Transformer models on natural language tasks have intrigued the
vision community to study their application to computer vision problems. Among their salient …

Cited by 2495 Related articles All 8 versions

[PDF] cell.com Full View

Are we ready for a new paradigm shift? A survey on visual deep MLP

R Liu, Y Li, L Tao, D Liang, HT Zheng - Patterns, 2022 - cell.com

Recently, the proposed deep multilayer perceptron (MLP) models have stirred up a lot of
interest in the vision community. Historically, the availability of larger datasets combined with …

Cited by 73 Related articles All 7 versions

[PDF] thecvf.com

Scaling vision transformers to gigapixel images via hierarchical self-supervised learning

RJ Chen, C Chen, Y Li, TY Chen… - Proceedings of the …, 2022 - openaccess.thecvf.com

Vision Transformers (ViTs) and their multi-scale and hierarchical variations have
been successful at capturing image representations but their use has been generally …

Cited by 376 Related articles All 6 versions

[PDF] thecvf.com

Learning to prompt for continual learning

Z Wang, Z Zhang, CY Lee, H Zhang… - Proceedings of the …, 2022 - openaccess.thecvf.com

The mainstream paradigm behind continual learning has been to adapt the model
parameters to non-stationary data distributions, where catastrophic forgetting is the central …

Cited by 608 Related articles All 8 versions

[PDF] arxiv.org

Compute trends across three eras of machine learning

J Sevilla, L Heim, A Ho, T Besiroglu… - … Joint Conference on …, 2022 - ieeexplore.ieee.org

Compute, data, and algorithmic advances are the three fundamental factors that drive
progress in modern Machine Learning (ML). In this paper we study trends in the most readily …

Cited by 291 Related articles All 4 versions

[PDF] thecvf.com

Uformer: A general U-shaped transformer for image restoration

Z Wang, X Cun, J Bao, W Zhou… - Proceedings of the …, 2022 - openaccess.thecvf.com

In this paper, we present Uformer, an effective and efficient Transformer-based architecture
for image restoration, in which we build a hierarchical encoder-decoder network using the …

Cited by 1459 Related articles All 7 versions

[PDF] thecvf.com

A generalist framework for panoptic segmentation of images and videos

T Chen, L Li, S Saxena, G Hinton… - Proceedings of the …, 2023 - openaccess.thecvf.com

Panoptic segmentation assigns semantic and instance ID labels to every pixel of an image.
As permutations of instance IDs are also valid solutions, the task requires learning of high …

Cited by 89 Related articles All 7 versions

[PDF] baai.ac.cn

A survey on vision transformer

K Han, Y Wang, H Chen, X Chen, J Guo… - IEEE transactions on …, 2022 - ieeexplore.ieee.org

Transformer, first applied to the field of natural language processing, is a type of deep neural
network mainly based on the self-attention mechanism. Thanks to its strong representation …

Cited by 2028 Related articles All 7 versions

[PDF] um.edu.mo

Hyperspectral image transformer classification networks

X Yang, W Cao, Y Lu, Y Zhou - IEEE Transactions on …, 2022 - ieeexplore.ieee.org

Hyperspectral image (HSI) classification is an important task in earth observation missions.
Convolution neural networks (CNNs) with the powerful ability of feature extraction have …

Cited by 158 Related articles All 4 versions

[PDF] arxiv.org

Crossformer++: A versatile vision transformer hinging on cross-scale attention

W Wang, W Chen, Q Qiu, L Chen, B Wu… - … on Pattern Analysis …, 2023 - ieeexplore.ieee.org

While features of different scales are perceptually important to visual inputs, existing vision
transformers do not yet take advantage of them explicitly. To this end, we first propose a …

Cited by 261 Related articles All 11 versions