Attention mechanisms in computer vision: A survey
Humans can naturally and effectively find salient regions in complex scenes. Motivated by
this observation, attention mechanisms were introduced into computer vision with the aim of …
this observation, attention mechanisms were introduced into computer vision with the aim of …
Transforming medical imaging with Transformers? A comparative review of key properties, current progresses, and future perspectives
Transformer, one of the latest technological advances of deep learning, has gained
prevalence in natural language processing or computer vision. Since medical imaging bear …
prevalence in natural language processing or computer vision. Since medical imaging bear …
Segment everything everywhere all at once
In this work, we present SEEM, a promotable and interactive model for segmenting
everything everywhere all at once in an image. In SEEM, we propose a novel and versatile …
everything everywhere all at once in an image. In SEEM, we propose a novel and versatile …
Internimage: Exploring large-scale vision foundation models with deformable convolutions
Compared to the great progress of large-scale vision transformers (ViTs) in recent years,
large-scale models based on convolutional neural networks (CNNs) are still in an early …
large-scale models based on convolutional neural networks (CNNs) are still in an early …
Segnext: Rethinking convolutional attention design for semantic segmentation
We present SegNeXt, a simple convolutional network architecture for semantic
segmentation. Recent transformer-based models have dominated the field of se-mantic …
segmentation. Recent transformer-based models have dominated the field of se-mantic …
Depth anything: Unleashing the power of large-scale unlabeled data
Abstract This work presents Depth Anything a highly practical solution for robust monocular
depth estimation. Without pursuing novel technical modules we aim to build a simple yet …
depth estimation. Without pursuing novel technical modules we aim to build a simple yet …
Cross-city matters: A multimodal remote sensing benchmark dataset for cross-city semantic segmentation using high-resolution domain adaptation networks
Artificial intelligence (AI) approaches nowadays have gained remarkable success in single-
modality-dominated remote sensing (RS) applications, especially with an emphasis on …
modality-dominated remote sensing (RS) applications, especially with an emphasis on …
Segment anything model for medical image analysis: an experimental study
Training segmentation models for medical images continues to be challenging due to the
limited availability of data annotations. Segment Anything Model (SAM) is a foundation …
limited availability of data annotations. Segment Anything Model (SAM) is a foundation …
Convolutions die hard: Open-vocabulary segmentation with single frozen convolutional clip
Open-vocabulary segmentation is a challenging task requiring segmenting and recognizing
objects from an open set of categories in diverse environments. One way to address this …
objects from an open set of categories in diverse environments. One way to address this …
Oneformer: One transformer to rule universal image segmentation
Abstract Universal Image Segmentation is not a new concept. Past attempts to unify image
segmentation include scene parsing, panoptic segmentation, and, more recently, new …
segmentation include scene parsing, panoptic segmentation, and, more recently, new …