Transformers in medical imaging: A survey
Following unprecedented success on the natural language tasks, Transformers have been
successfully applied to several computer vision problems, achieving state-of-the-art results …
Attention mechanisms in computer vision: A survey
Humans can naturally and effectively find salient regions in complex scenes. Motivated by
this observation, attention mechanisms were introduced into computer vision with the aim of …
Designing network design strategies through gradient path analysis
Designing a high-efficiency and high-quality expressive network architecture has always
been the most important research topic in the field of deep learning. Most of today's network …
Denseclip: Language-guided dense prediction with context-aware prompting
Recent progress has shown that large-scale pre-training using contrastive image-text pairs
can be a promising alternative for high-quality visual representation learning from natural …
Video swin transformer
The vision community is witnessing a modeling shift from CNNs to Transformers, where pure
Transformer architectures have attained top accuracy on the major video recognition …
Segmenter: Transformer for semantic segmentation
Image segmentation is often ambiguous at the level of individual image patches and
requires contextual information to reach label consensus. In this paper we introduce …
Contextual transformer networks for visual recognition
Transformer with self-attention has led to the revolutionizing of natural language processing
field, and recently inspires the emergence of Transformer-style architecture design with …
You only learn one representation: Unified network for multiple tasks
People "understand" the world via vision, hearing, tactile, and also the past experience.
Human experience can be learned through normal learning (we call it explicit knowledge) …
Swin transformer: Hierarchical vision transformer using shifted windows
This paper presents a new vision Transformer, called Swin Transformer, that capably serves
as a general-purpose backbone for computer vision. Challenges in adapting Transformer …
Focal self-attention for local-global interactions in vision transformers
Recently, Vision Transformer and its variants have shown great promise on various
computer vision tasks. The ability of capturing short- and long-range visual dependencies …