Vision transformers for dense prediction: A survey
S Zuo, Y Xiao, X Chang, X Wang - Knowledge-Based Systems, 2022 - Elsevier
Transformers have demonstrated impressive expressiveness and transfer capability in
computer vision fields. Dense prediction is a fundamental problem in computer vision that is …
ViTPose: Simple vision transformer baselines for human pose estimation
Although no specific domain knowledge is considered in the design, plain vision
transformers have shown excellent performance in visual recognition tasks. However, little …
EdgeNeXt: Efficiently amalgamated CNN-transformer architecture for mobile vision applications
In the pursuit of achieving ever-increasing accuracy, large and complex neural networks are
usually developed. Such models demand high computational resources and therefore …
SwinBTS: A method for 3D multimodal brain tumor segmentation using swin transformer
Y Jiang, Y Zhang, X Lin, J Dong, T Cheng, J Liang - Brain Sciences, 2022 - mdpi.com
Brain tumor semantic segmentation is a critical medical image processing work, which aids
clinicians in diagnosing patients and determining the extent of lesions. Convolutional neural …
Transformer meets remote sensing video detection and tracking: A comprehensive survey
Transformer has shown excellent performance in remote sensing field with long-range
modeling capabilities. Remote sensing video (RSV) moving object detection and tracking …
KVT: k-NN Attention for Boosting Vision Transformers
Convolutional Neural Networks (CNNs) have dominated computer vision for years,
due to its ability in capturing locality and translation invariance. Recently, many vision …
VTC-LFC: Vision transformer compression with low-frequency components
Although Vision transformers (ViTs) have recently dominated many vision tasks,
deploying ViT models on resource-limited devices remains a challenging problem. To …
Revitalizing convolutional network for image restoration
Image restoration aims to reconstruct a high-quality image from its corrupted version, playing
essential roles in many scenarios. Recent years have witnessed a paradigm shift in image …
Making vision transformers efficient from a token sparsification view
The quadratic computational complexity to the number of tokens limits the practical
applications of Vision Transformers (ViTs). Several works propose to prune redundant …
ViTPose++: Vision transformer for generic body pose estimation
In this paper, we show the surprisingly good properties of plain vision transformers for body
pose estimation from various aspects, namely simplicity in model structure, scalability in …