A survey of techniques for optimizing transformer inference

KT Chitty-Venkata, S Mittal, M Emani… - Journal of Systems …, 2023 - Elsevier
Recent years have seen a phenomenal rise in the performance and applications of
transformer neural networks. The family of transformer networks, including Bidirectional …

Vision transformers for dense prediction: A survey

S Zuo, Y Xiao, X Chang, X Wang - Knowledge-Based Systems, 2022 - Elsevier
Transformers have demonstrated impressive expressiveness and transfer capability in
computer vision fields. Dense prediction is a fundamental problem in computer vision that is …

Segment anything is not always perfect: An investigation of SAM on different real-world applications

W Ji, J Li, Q Bi, T Liu, W Li, L Cheng - 2024 - Springer
Recently, Meta AI Research approaches a general, promptable segment anything
model (SAM) pre-trained on an unprecedentedly large segmentation dataset (SA-1B) …

SegFormer: Simple and efficient design for semantic segmentation with transformers

E Xie, W Wang, Z Yu, A Anandkumar… - Advances in neural …, 2021 - proceedings.neurips.cc
We present SegFormer, a simple, efficient yet powerful semantic segmentation framework
which unifies Transformers with lightweight multilayer perceptron (MLP) decoders …

Transformer in transformer

K Han, A Xiao, E Wu, J Guo, C Xu… - Advances in neural …, 2021 - proceedings.neurips.cc
Transformer is a new kind of neural architecture which encodes the input data as powerful
features via the attention mechanism. Basically, the visual transformers first divide the input …

Pyramid vision transformer: A versatile backbone for dense prediction without convolutions

W Wang, E Xie, X Li, DP Fan, K Song… - Proceedings of the …, 2021 - openaccess.thecvf.com
Although convolutional neural networks (CNNs) have achieved great success in computer
vision, this work investigates a simpler, convolution-free backbone network useful for many …

TransReID: Transformer-based object re-identification

S He, H Luo, P Wang, F Wang, H Li… - Proceedings of the …, 2021 - openaccess.thecvf.com
Extracting robust feature representation is one of the key challenges in object re-
identification (ReID). Although convolution neural network (CNN)-based methods have …

TransFG: A transformer architecture for fine-grained recognition

J He, JN Chen, S Liu, A Kortylewski, C Yang… - Proceedings of the …, 2022 - ojs.aaai.org
Fine-grained visual classification (FGVC) which aims at recognizing objects from
subcategories is a very challenging task due to the inherently subtle inter-class differences …

Multi-compound transformer for accurate biomedical image segmentation

Y Ji, R Zhang, H Wang, Z Li, L Wu, S Zhang… - … Image Computing and …, 2021 - Springer
The recent vision transformer (i.e., for image classification) learns non-local attentive
interaction of different patch tokens. However, prior arts miss learning the cross-scale …

Dex-NeRF: Using a neural radiance field to grasp transparent objects

J Ichnowski, Y Avigal, J Kerr, K Goldberg - arXiv preprint arXiv:2110.14217, 2021 - arxiv.org
The ability to grasp and manipulate transparent objects is a major challenge for robots.
Existing depth cameras have difficulty detecting, localizing, and inferring the geometry of …