All versions - Academic Resource Search

Exploring plain vision transformer backbones for object detection

Y Li, H Mao, R Girshick, K He - European conference on computer vision, 2022 - Springer

We explore the plain, non-hierarchical Vision Transformer (ViT) as a backbone network for
object detection. This design enables the original ViT architecture to be fine-tuned for object …

Cited by 619 · Related articles

Exploring Plain Vision Transformer Backbones for Object Detection

Y Li, H Mao, R Girshick, K He - European Conference on Computer …, 2022 - dl.acm.org

We explore the plain, non-hierarchical Vision Transformer (ViT) as a backbone network for
object detection. This design enables the original ViT architecture to be fine-tuned for object …

Exploring Plain Vision Transformer Backbones for Object Detection

Y Li, H Mao, R Girshick, K He - arXiv preprint arXiv:2203.16527, 2022 - arxiv.org

We explore the plain, non-hierarchical Vision Transformer (ViT) as a backbone network for
object detection. This design enables the original ViT architecture to be fine-tuned for object …

[PDF] Exploring Plain Vision Transformer Backbones for Object Detection

Y Li, H Mao, R Girshick, K He - ecva.net

We explore the plain, non-hierarchical Vision Transformer (ViT) as a backbone network for
object detection. This design enables the original ViT architecture to be fine-tuned for object …

Exploring Plain Vision Transformer Backbones for Object Detection

Y Li, H Mao, R Girshick, K He - arXiv e-prints, 2022 - ui.adsabs.harvard.edu

We explore the plain, non-hierarchical Vision Transformer (ViT) as a backbone network for
object detection. This design enables the original ViT architecture to be fine-tuned for object …

[PDF] Exploring Plain Vision Transformer Backbones for Object Detection

Y Li, H Mao, R Girshick, K He - arxiv.org

We explore the plain, non-hierarchical Vision Transformer (ViT) as a backbone network for
object detection. This design enables the original ViT architecture to be fine-tuned for object …