- Academic resource search

Vision transformers for dense prediction: A survey

S Zuo, Y Xiao, X Chang, X Wang - Knowledge-Based Systems, 2022 - Elsevier

Transformers have demonstrated impressive expressiveness and transfer capability in
computer vision fields. Dense prediction is a fundamental problem in computer vision that is …

Cited by 44 - Related articles - All 3 versions

[PDF] neurips.cc

ViTPose: Simple vision transformer baselines for human pose estimation

Y Xu, J Zhang, Q Zhang, D Tao - Advances in Neural …, 2022 - proceedings.neurips.cc

Although no specific domain knowledge is considered in the design, plain vision
transformers have shown excellent performance in visual recognition tasks. However, little …

Cited by 567 - Related articles - All 5 versions

[PDF] arxiv.org

EdgeNeXt: efficiently amalgamated CNN-transformer architecture for mobile vision applications

M Maaz, A Shaker, H Cholakkal, S Khan… - European conference on …, 2022 - Springer

In the pursuit of achieving ever-increasing accuracy, large and complex neural networks are
usually developed. Such models demand high computational resources and therefore …

Cited by 218 - Related articles - All 8 versions

[PDF] mdpi.com

SwinBTS: A method for 3D multimodal brain tumor segmentation using swin transformer

Y Jiang, Y Zhang, X Lin, J Dong, T Cheng, J Liang - Brain sciences, 2022 - mdpi.com

Brain tumor semantic segmentation is a critical medical image processing work, which aids
clinicians in diagnosing patients and determining the extent of lesions. Convolutional neural …

Cited by 895 - Related articles - All 12 versions

[PDF] ieee.org

Transformer meets remote sensing video detection and tracking: A comprehensive survey

L Jiao, X Zhang, X Liu, F Liu, S Yang… - IEEE Journal of …, 2023 - ieeexplore.ieee.org

Transformer has shown excellent performance in remote sensing field with long-range
modeling capabilities. Remote sensing video (RSV) moving object detection and tracking …

Cited by 19 - Related articles - All 2 versions

[PDF] arxiv.org

KVT: k-NN Attention for Boosting Vision Transformers

P Wang, X Wang, F Wang, M Lin, S Chang, H Li… - European conference on …, 2022 - Springer

Convolutional Neural Networks (CNNs) have dominated computer vision for years,
due to their ability to capture locality and translation invariance. Recently, many vision …

Cited by 111 - Related articles - All 6 versions

[PDF] neurips.cc

VTC-LFC: Vision transformer compression with low-frequency components

Z Wang, H Luo, P Wang, F Ding… - Advances in Neural …, 2022 - proceedings.neurips.cc

Although Vision transformers (ViTs) have recently dominated many vision tasks,
deploying ViT models on resource-limited devices remains a challenging problem. To …

Cited by 26 - Related articles - All 4 versions

[PDF] tum.de

Revitalizing convolutional network for image restoration

Y Cui, W Ren, X Cao, A Knoll - IEEE Transactions on Pattern …, 2024 - ieeexplore.ieee.org

Image restoration aims to reconstruct a high-quality image from its corrupted version, playing
essential roles in many scenarios. Recent years have witnessed a paradigm shift in image …

Cited by 10 - Related articles - All 7 versions

[PDF] thecvf.com

Making vision transformers efficient from a token sparsification view

S Chang, P Wang, M Lin, F Wang… - Proceedings of the …, 2023 - openaccess.thecvf.com

The quadratic computational complexity to the number of tokens limits the practical
applications of Vision Transformers (ViTs). Several works propose to prune redundant …

Cited by 23 - Related articles - All 5 versions

ViTPose++: Vision transformer for generic body pose estimation

Y Xu, J Zhang, Q Zhang, D Tao - IEEE Transactions on Pattern …, 2023 - ieeexplore.ieee.org

In this paper, we show the surprisingly good properties of plain vision transformers for body
pose estimation from various aspects, namely simplicity in model structure, scalability in …

Cited by 30 - Related articles - All 7 versions