All versions - Academic resource search

Multiscale Vision Transformers

H Fan, B Xiong, K Mangalam, Y Li… - Proceedings of the …, 2021 - openaccess.thecvf.com

We present Multiscale Vision Transformers (MViT) for video and image recognition,
by connecting the seminal idea of multiscale feature hierarchies with transformer models …

Cited by: 1544

Multiscale Vision Transformers

H Fan, B Xiong, K Mangalam, Y Li… - 2021 IEEE/CVF …, 2021 - ieeexplore.ieee.org

We present Multiscale Vision Transformers (MViT) for video and image recognition, by
connecting the seminal idea of multiscale feature hierarchies with transformer models …

An Empirical Study of Training Self-Supervised Vision Transformers

X Chen, S Xie, K He - Proceedings of the IEEE/CVF …, 2021 - openaccess.thecvf.com

This paper does not describe a novel method. Instead, it studies a straightforward,
incremental, yet must-know baseline given the recent progress in computer vision: self …

Multiscale Vision Transformers

H Fan, B Xiong, K Mangalam, Y Li, Z Yan… - arXiv preprint arXiv …, 2021 - arxiv.org

We present Multiscale Vision Transformers (MViT) for video and image recognition, by
connecting the seminal idea of multiscale feature hierarchies with transformer models …

An Empirical Study of Training Self-Supervised Vision Transformers

X Chen, S Xie, K He - 18th IEEE/CVF International Conference …, 2021 - nyuscholars.nyu.edu

This paper does not describe a novel method. Instead, it studies a straightforward,
incremental, yet must-know baseline given the recent progress in computer vision: self …

Multiscale Vision Transformers

H Fan, B Xiong, K Mangalam, Y Li, Z Yan… - 2021 IEEE/CVF …, 2021 - computer.org

We present Multiscale Vision Transformers (MViT) for video and image recognition,
by connecting the seminal idea of multiscale feature hierarchies with transformer models …

Multiscale Vision Transformers

H Fan, B Xiong, K Mangalam, Y Li, Z Yan… - arXiv e …, 2021 - ui.adsabs.harvard.edu

We present Multiscale Vision Transformers (MViT) for video and image recognition,
by connecting the seminal idea of multiscale feature hierarchies with transformer models …

An Empirical Study of Training Self-Supervised Vision Transformers

X Chen, S Xie, K He - arXiv e-prints, 2021 - ui.adsabs.harvard.edu

This paper does not describe a novel method. Instead, it studies a straightforward,
incremental, yet must-know baseline given the recent progress in computer vision: self …

An Empirical Study of Training Self-Supervised Vision Transformers

X Chen, S Xie, K He - 2021 IEEE/CVF International Conference on …, 2021 - computer.org

This paper does not describe a novel method. Instead, it studies a straightforward,
incremental, yet must-know baseline given the recent progress in computer vision: self …

An Empirical Study of Training Self-Supervised Vision Transformers

X Chen, S Xie, K He - arXiv preprint arXiv:2104.02057, 2021 - arxiv.org

This paper does not describe a novel method. Instead, it studies a straightforward,
incremental, yet must-know baseline given the recent progress in computer vision: self …