Related articles - Scholarly resource search

A ConvNet for the 2020s

Z Liu, H Mao, CY Wu, C Feichtenhofer… - Proceedings of the …, 2022 - openaccess.thecvf.com

The "Roaring 20s" of visual recognition began with the introduction of Vision Transformers
(ViTs), which quickly superseded ConvNets as the state-of-the-art image classification …

Cited by 4886 · Related articles · All 11 versions

[PDF] arxiv.org

Conv2Former: A simple transformer-style ConvNet for visual recognition

Q Hou, CZ Lu, MM Cheng… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org

Vision Transformers have been the most popular network architecture in visual recognition
recently due to the strong ability of encode global information. However, its high …

Cited by 87 · Related articles · All 7 versions

[PDF] thecvf.com

ConvNeXt V2: Co-designing and scaling ConvNets with masked autoencoders

S Woo, S Debnath, R Hu, X Chen… - Proceedings of the …, 2023 - openaccess.thecvf.com

Driven by improved architectures and better representation learning frameworks, the field of
visual recognition has enjoyed rapid modernization and performance boost in the early …

Cited by 369 · Related articles · All 8 versions

[PDF] thecvf.com

Visformer: The vision-friendly transformer

Z Chen, L Xie, J Niu, X Liu, L Wei… - Proceedings of the …, 2021 - openaccess.thecvf.com

The past year has witnessed the rapid development of applying the Transformer module to
vision problems. While some researchers have demonstrated that Transformer-based …

Cited by 210 · Related articles · All 6 versions

[PDF] arxiv.org

LightViT: Towards light-weight convolution-free vision transformers

T Huang, L Huang, S You, F Wang, C Qian… - arXiv preprint arXiv …, 2022 - arxiv.org

Vision transformers (ViTs) are usually considered to be less light-weight than convolutional
neural networks (CNNs) due to the lack of inductive bias. Recent works thus resort to …

Cited by 55 · Related articles · All 3 versions

[PDF] thecvf.com

ConvNets vs. Transformers: Whose visual representations are more transferable?

HY Zhou, C Lu, S Yang, Y Yu - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com

Vision transformers have attracted much attention from computer vision researchers as they
are not restricted to the spatial inductive bias of ConvNets. However, although Transformer …

Cited by 57 · Related articles · All 9 versions

[PDF] thecvf.com

Understanding robustness of transformers for image classification

S Bhojanapalli, A Chakrabarti… - Proceedings of the …, 2021 - openaccess.thecvf.com

Deep Convolutional Neural Networks (CNNs) have long been the architecture of
choice for computer vision tasks. Recently, Transformer-based architectures like Vision …

Cited by 397 · Related articles · All 8 versions

[PDF] arxiv.org

VOLO: Vision outlooker for visual recognition

L Yuan, Q Hou, Z Jiang, J Feng… - IEEE transactions on …, 2022 - ieeexplore.ieee.org

Recently, Vision Transformers (ViTs) have been broadly explored in visual recognition. With
low efficiency in encoding fine-level features, the performance of ViTs is still inferior to the …

Cited by 294 · Related articles · All 7 versions

[PDF] thecvf.com

CMT: Convolutional neural networks meet vision transformers

J Guo, K Han, H Wu, Y Tang, X Chen… - Proceedings of the …, 2022 - openaccess.thecvf.com

Vision transformers have been successfully applied to image recognition tasks due to their
ability to capture long-range dependencies within an image. However, there are still gaps in …

Cited by 656 · Related articles · All 6 versions

[PDF] thecvf.com

MSG-Transformer: Exchanging local spatial information by manipulating messenger tokens

J Fang, L Xie, X Wang, X Zhang… - Proceedings of the …, 2022 - openaccess.thecvf.com

Transformers have offered a new methodology of designing neural networks for visual
recognition. Compared to convolutional networks, Transformers enjoy the ability of referring …

Cited by 79 · Related articles · All 6 versions