Multiscale Vision Transformers

H Fan, B Xiong, K Mangalam, Y Li… - Proceedings of the …, 2021 - openaccess.thecvf.com
We present Multiscale Vision Transformers (MViT) for video and image recognition,
by connecting the seminal idea of multiscale feature hierarchies with transformer models …

Multiscale Vision Transformers

H Fan, B Xiong, K Mangalam, Y Li… - 2021 IEEE/CVF …, 2021 - ieeexplore.ieee.org
We present Multiscale Vision Transformers (MViT) for video and image recognition, by
connecting the seminal idea of multiscale feature hierarchies with transformer models …

An Empirical Study of Training Self-Supervised Vision Transformers

X Chen, S Xie, K He - Proceedings of the IEEE/CVF …, 2021 - openaccess.thecvf.com
This paper does not describe a novel method. Instead, it studies a straightforward,
incremental, yet must-know baseline given the recent progress in computer vision: self …

Multiscale Vision Transformers

H Fan, B Xiong, K Mangalam, Y Li, Z Yan… - arXiv preprint arXiv …, 2021 - arxiv.org
We present Multiscale Vision Transformers (MViT) for video and image recognition, by
connecting the seminal idea of multiscale feature hierarchies with transformer models …

An Empirical Study of Training Self-Supervised Vision Transformers

X Chen, S Xie, K He - 18th IEEE/CVF International Conference …, 2021 - nyuscholars.nyu.edu
This paper does not describe a novel method. Instead, it studies a straightforward,
incremental, yet must-know baseline given the recent progress in computer vision: self …

Multiscale Vision Transformers

H Fan, B Xiong, K Mangalam, Y Li, Z Yan… - 2021 IEEE/CVF …, 2021 - computer.org
We present Multiscale Vision Transformers (MViT) for video and image recognition,
by connecting the seminal idea of multiscale feature hierarchies with transformer models …

Multiscale Vision Transformers

H Fan, B Xiong, K Mangalam, Y Li, Z Yan… - arXiv e …, 2021 - ui.adsabs.harvard.edu
We present Multiscale Vision Transformers (MViT) for video and image recognition,
by connecting the seminal idea of multiscale feature hierarchies with transformer models …

An Empirical Study of Training Self-Supervised Vision Transformers

X Chen, S Xie, K He - arXiv e-prints, 2021 - ui.adsabs.harvard.edu
This paper does not describe a novel method. Instead, it studies a straightforward,
incremental, yet must-know baseline given the recent progress in computer vision: self …

An Empirical Study of Training Self-Supervised Vision Transformers

X Chen, S Xie, K He - 2021 IEEE/CVF International Conference on …, 2021 - computer.org
This paper does not describe a novel method. Instead, it studies a straightforward,
incremental, yet must-know baseline given the recent progress in computer vision: self …

An Empirical Study of Training Self-Supervised Vision Transformers

X Chen, S Xie, K He - arXiv preprint arXiv:2104.02057, 2021 - arxiv.org
This paper does not describe a novel method. Instead, it studies a straightforward,
incremental, yet must-know baseline given the recent progress in computer vision: self …