Transforming medical imaging with Transformers? A comparative review of key properties, current progresses, and future perspectives
Transformer, one of the latest technological advances of deep learning, has gained
prevalence in natural language processing or computer vision. Since medical imaging bear …
Federated vehicular transformers and their federations: Privacy-preserving computing and cooperation for autonomous driving
Cooperative computing is promising to enhance the performance and safety of autonomous
vehicles benefiting from the increase in the amount, diversity as well as scope of data …
Crossformer: Transformer utilizing cross-dimension dependency for multivariate time series forecasting
Recently many deep models have been proposed for multivariate time series (MTS)
forecasting. In particular, Transformer-based models have shown great potential because …
Vision GNN: An image is worth graph of nodes
Network architecture plays a key role in the deep learning-based computer vision system.
The widely-used convolutional neural network and transformer treat the image as a grid or …
PETR: Position embedding transformation for multi-view 3D object detection
In this paper, we develop position embedding transformation (PETR) for multi-view 3D
object detection. PETR encodes the position information of 3D coordinates into image …
Stratified transformer for 3D point cloud segmentation
3D point cloud segmentation has made tremendous progress in recent years. Most
current methods focus on aggregating local features, but fail to directly model long-range …
MetaFormer is actually what you need for vision
Transformers have shown great potential in computer vision tasks. A common belief is their
attention-based token mixer module contributes most to their competence. However, recent …
Flexible diffusion modeling of long videos
We present a framework for video modeling based on denoising diffusion probabilistic
models that produces long-duration video completions in a variety of realistic environments …
DaViT: Dual attention vision transformers
In this work, we introduce Dual Attention Vision Transformers (DaViT), a simple yet effective
vision transformer architecture that is able to capture global context while maintaining …
TinyViT: Fast pretraining distillation for small vision transformers
Vision transformer (ViT) recently has drawn great attention in computer vision due to its
remarkable model capability. However, most prevailing ViT models suffer from huge number …