Exploring Plain Vision Transformer Backbones for Object Detection

Y Li, H Mao, R Girshick, K He - European conference on computer vision, 2022 - Springer
We explore the plain, non-hierarchical Vision Transformer (ViT) as a backbone network for
object detection. This design enables the original ViT architecture to be fine-tuned for object …

Exploring Plain Vision Transformer Backbones for Object Detection

Y Li, H Mao, R Girshick, K He - European Conference on Computer …, 2022 - dl.acm.org
We explore the plain, non-hierarchical Vision Transformer (ViT) as a backbone network for
object detection. This design enables the original ViT architecture to be fine-tuned for object …

Exploring Plain Vision Transformer Backbones for Object Detection

Y Li, H Mao, R Girshick, K He - arXiv preprint arXiv:2203.16527, 2022 - arxiv.org
We explore the plain, non-hierarchical Vision Transformer (ViT) as a backbone network for
object detection. This design enables the original ViT architecture to be fine-tuned for object …

[PDF] Exploring Plain Vision Transformer Backbones for Object Detection

Y Li, H Mao, R Girshick, K He - ecva.net
We explore the plain, non-hierarchical Vision Transformer (ViT) as a backbone network for
object detection. This design enables the original ViT architecture to be fine-tuned for object …

Exploring Plain Vision Transformer Backbones for Object Detection

Y Li, H Mao, R Girshick, K He - arXiv e-prints, 2022 - ui.adsabs.harvard.edu
We explore the plain, non-hierarchical Vision Transformer (ViT) as a backbone network for
object detection. This design enables the original ViT architecture to be fine-tuned for object …

[PDF] Exploring Plain Vision Transformer Backbones for Object Detection

Y Li, H Mao, R Girshick, K He - arxiv.org
We explore the plain, non-hierarchical Vision Transformer (ViT) as a backbone network for
object detection. This design enables the original ViT architecture to be fine-tuned for object …